INSY 695: Final Group Project - What Makes a Popular TED Talk?

Arnaud Guzman-Annès | ID: 260882529

Ram Babu | ID: 260958970

Sophie Courtemanche-Martel | ID: 260743568

Duncan Wang | ID: 260710229

Jules Zielinski | ID: 260760796



Date: February 22nd, 2021

Objective:

TED talks are video recordings of influential talks given at and hosted by TED Conferences LLC. TED was founded in 1984, and has since built a reputation for spreading inspiring powerful ideas in fields ranging from tech to science to education. As video recordings of TED talks have garnered over 1 billion views to date, it is evident that TED represents a significant platform and opportunity for anyone with a powerful mission to raise awareness and attention to their work.

Hypothesis

We hypothesized that due to the wide range of data types and distributions, a tree-based algorithm such as Gradient Boosting Regressor would perform most optimally at prediction. We further hypothesized that TED talk viewers are drawn to or away from a talk based on its content, such as its topic, theme, or speaker, rather than other attributes such as how long it is or when it was released.


I - Data preprocessing

Setup

First, let's import a few common modules, ensure MatplotLib plots figures inline and prepare a function to save the figures. We also check that Python 3.5 or later is installed as well as Scikit-Learn (V ≥0.20).

In [1]:
# Python ≥3.5 is required
import sys
assert sys.version_info >= (3, 5)

# Scikit-Learn ≥0.20 is required
import sklearn
assert sklearn.__version__ >= "0.20"

# Common imports
import numpy as np
import os

import pandas as pd

# To plot pretty figures
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rc('axes', labelsize=14)
mpl.rc('xtick', labelsize=12)
mpl.rc('ytick', labelsize=12)

# Where to save the figures
PROJECT_ROOT_DIR = "."
CHAPTER_ID = "end_to_end_project"
IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID)
os.makedirs(IMAGES_PATH, exist_ok=True)

def save_fig(fig_id, tight_layout=True, fig_extension="png", resolution=300):
    path = os.path.join(IMAGES_PATH, fig_id + "." + fig_extension)
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format=fig_extension, dpi=resolution)

# Ignore useless warnings (see SciPy issue #5998)
import warnings
warnings.filterwarnings(action="ignore", message="^internal gelsd")
pd.options.mode.chained_assignment = None  # default='warn'

#display all columns
pd.set_option('display.max_columns', None)

Get the data

From GitHub repository

In [2]:
import pandas as pd
import requests
import io
    
# Downloading the csv file from my GitHub account
url = "https://raw.githubusercontent.com/McGill-MMA-EnterpriseAnalytics/TED/main/data/ted_main.csv"

download = requests.get(url).content

# Reading the downloaded content and turning it into a pandas dataframe
# We will use "churn" instead of "df" for replication purposes
df = pd.read_csv(io.StringIO(download.decode('utf-8')))

# Printing out the first 5 rows of the dataframe
df.head()
Out[2]:
comments description duration event film_date languages main_speaker name num_speaker published_date ratings related_talks speaker_occupation tags title url views
0 4553 Sir Ken Robinson makes an entertaining and pro... 1164 TED2006 1140825600 60 Ken Robinson Ken Robinson: Do schools kill creativity? 1 1151367060 [{'id': 7, 'name': 'Funny', 'count': 19645}, {... [{'id': 865, 'hero': 'https://pe.tedcdn.com/im... Author/educator ['children', 'creativity', 'culture', 'dance',... Do schools kill creativity? https://www.ted.com/talks/ken_robinson_says_sc... 47227110
1 265 With the same humor and humanity he exuded in ... 977 TED2006 1140825600 43 Al Gore Al Gore: Averting the climate crisis 1 1151367060 [{'id': 7, 'name': 'Funny', 'count': 544}, {'i... [{'id': 243, 'hero': 'https://pe.tedcdn.com/im... Climate advocate ['alternative energy', 'cars', 'climate change... Averting the climate crisis https://www.ted.com/talks/al_gore_on_averting_... 3200520
2 124 New York Times columnist David Pogue takes aim... 1286 TED2006 1140739200 26 David Pogue David Pogue: Simplicity sells 1 1151367060 [{'id': 7, 'name': 'Funny', 'count': 964}, {'i... [{'id': 1725, 'hero': 'https://pe.tedcdn.com/i... Technology columnist ['computers', 'entertainment', 'interface desi... Simplicity sells https://www.ted.com/talks/david_pogue_says_sim... 1636292
3 200 In an emotionally charged talk, MacArthur-winn... 1116 TED2006 1140912000 35 Majora Carter Majora Carter: Greening the ghetto 1 1151367060 [{'id': 3, 'name': 'Courageous', 'count': 760}... [{'id': 1041, 'hero': 'https://pe.tedcdn.com/i... Activist for environmental justice ['MacArthur grant', 'activism', 'business', 'c... Greening the ghetto https://www.ted.com/talks/majora_carter_s_tale... 1697550
4 593 You've never seen data presented like this. Wi... 1190 TED2006 1140566400 48 Hans Rosling Hans Rosling: The best stats you've ever seen 1 1151440680 [{'id': 9, 'name': 'Ingenious', 'count': 3202}... [{'id': 2056, 'hero': 'https://pe.tedcdn.com/i... Global health expert; data visionary ['Africa', 'Asia', 'Google', 'demo', 'economic... The best stats you've ever seen https://www.ted.com/talks/hans_rosling_shows_t... 12005869

Data preprocessing

About the original dataset

  • name: The official name of the TED Talk. Includes the title and the speaker.
  • title: The title of the talk
  • description: A blurb of what the talk is about.
  • main_speaker: The first named speaker of the talk.
  • speaker_occupation: The occupation of the main speaker.
  • num_speaker: The number of speakers in the talk.
  • duration: The duration of the talk in seconds.
  • event: The TED/TEDx event where the talk took place.
  • film_date: The Unix timestamp of the filming.
  • published_date: The Unix timestamp for the publication of the talk on TED.com
  • comments: The number of first level comments made on the talk.
  • tags: The themes associated with the talk.
  • languages: The number of languages in which the talk is available.
  • ratings: A stringified dictionary of the various ratings given to the talk (inspiring, fascinating, jaw dropping, etc.)
  • related_talks: A list of dictionaries of recommended talks to watch next.
  • url: The URL of the talk.
  • views: The number of views on the talk.
In [3]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2550 entries, 0 to 2549
Data columns (total 17 columns):
 #   Column              Non-Null Count  Dtype 
---  ------              --------------  ----- 
 0   comments            2550 non-null   int64 
 1   description         2550 non-null   object
 2   duration            2550 non-null   int64 
 3   event               2550 non-null   object
 4   film_date           2550 non-null   int64 
 5   languages           2550 non-null   int64 
 6   main_speaker        2550 non-null   object
 7   name                2550 non-null   object
 8   num_speaker         2550 non-null   int64 
 9   published_date      2550 non-null   int64 
 10  ratings             2550 non-null   object
 11  related_talks       2550 non-null   object
 12  speaker_occupation  2544 non-null   object
 13  tags                2550 non-null   object
 14  title               2550 non-null   object
 15  url                 2550 non-null   object
 16  views               2550 non-null   int64 
dtypes: int64(7), object(10)
memory usage: 338.8+ KB
In [4]:
# Reorganize the columns for a better visualization

df = df[['name', 'title', 'description', 'main_speaker', 'speaker_occupation', 'num_speaker', 'duration', 'event', 'film_date', 'published_date', 'comments', 'tags', 'languages', 'ratings', 'related_talks', 'url', 'views']]
df.head()
Out[4]:
name title description main_speaker speaker_occupation num_speaker duration event film_date published_date comments tags languages ratings related_talks url views
0 Ken Robinson: Do schools kill creativity? Do schools kill creativity? Sir Ken Robinson makes an entertaining and pro... Ken Robinson Author/educator 1 1164 TED2006 1140825600 1151367060 4553 ['children', 'creativity', 'culture', 'dance',... 60 [{'id': 7, 'name': 'Funny', 'count': 19645}, {... [{'id': 865, 'hero': 'https://pe.tedcdn.com/im... https://www.ted.com/talks/ken_robinson_says_sc... 47227110
1 Al Gore: Averting the climate crisis Averting the climate crisis With the same humor and humanity he exuded in ... Al Gore Climate advocate 1 977 TED2006 1140825600 1151367060 265 ['alternative energy', 'cars', 'climate change... 43 [{'id': 7, 'name': 'Funny', 'count': 544}, {'i... [{'id': 243, 'hero': 'https://pe.tedcdn.com/im... https://www.ted.com/talks/al_gore_on_averting_... 3200520
2 David Pogue: Simplicity sells Simplicity sells New York Times columnist David Pogue takes aim... David Pogue Technology columnist 1 1286 TED2006 1140739200 1151367060 124 ['computers', 'entertainment', 'interface desi... 26 [{'id': 7, 'name': 'Funny', 'count': 964}, {'i... [{'id': 1725, 'hero': 'https://pe.tedcdn.com/i... https://www.ted.com/talks/david_pogue_says_sim... 1636292
3 Majora Carter: Greening the ghetto Greening the ghetto In an emotionally charged talk, MacArthur-winn... Majora Carter Activist for environmental justice 1 1116 TED2006 1140912000 1151367060 200 ['MacArthur grant', 'activism', 'business', 'c... 35 [{'id': 3, 'name': 'Courageous', 'count': 760}... [{'id': 1041, 'hero': 'https://pe.tedcdn.com/i... https://www.ted.com/talks/majora_carter_s_tale... 1697550
4 Hans Rosling: The best stats you've ever seen The best stats you've ever seen You've never seen data presented like this. Wi... Hans Rosling Global health expert; data visionary 1 1190 TED2006 1140566400 1151440680 593 ['Africa', 'Asia', 'Google', 'demo', 'economic... 48 [{'id': 9, 'name': 'Ingenious', 'count': 3202}... [{'id': 2056, 'hero': 'https://pe.tedcdn.com/i... https://www.ted.com/talks/hans_rosling_shows_t... 12005869
In [5]:
# Some more information about the dataset

display(df.shape)
display(df.isnull().sum())
display(df.describe())
(2550, 17)
name                  0
title                 0
description           0
main_speaker          0
speaker_occupation    6
num_speaker           0
duration              0
event                 0
film_date             0
published_date        0
comments              0
tags                  0
languages             0
ratings               0
related_talks         0
url                   0
views                 0
dtype: int64
num_speaker duration film_date published_date comments languages views
count 2550.000000 2550.000000 2.550000e+03 2.550000e+03 2550.000000 2550.000000 2.550000e+03
mean 1.028235 826.510196 1.321928e+09 1.343525e+09 191.562353 27.326275 1.698297e+06
std 0.207705 374.009138 1.197391e+08 9.464009e+07 282.315223 9.563452 2.498479e+06
min 1.000000 135.000000 7.464960e+07 1.151367e+09 2.000000 0.000000 5.044300e+04
25% 1.000000 577.000000 1.257466e+09 1.268463e+09 63.000000 23.000000 7.557928e+05
50% 1.000000 848.000000 1.333238e+09 1.340935e+09 118.000000 28.000000 1.124524e+06
75% 1.000000 1046.750000 1.412964e+09 1.423432e+09 221.750000 33.000000 1.700760e+06
max 5.000000 5256.000000 1.503792e+09 1.506092e+09 6404.000000 72.000000 4.722711e+07

Initial Observations

  • There are 2550 rows and 17 columns.
  • speaker_occupation column have 6 missing values.
In [6]:
# working on df_copy for the rest of data exploration

df_copy = df.copy()
In [7]:
# Filtering out the TED talks
df_copy = df_copy[df_copy['event'].str.contains('TED', regex=False, case=False, na=False)]
df_copy.shape
Out[7]:
(2439, 17)
In [8]:
# Converting Duration to minutes
df_copy["duration"] = round(df_copy["duration"]/60,2)
In [9]:
df_copy.fillna('Unknown', inplace = True)
In [10]:
df_copy['languages'].describe()
Out[10]:
count    2439.000000
mean       27.703977
std         9.205526
min         0.000000
25%        23.000000
50%        28.000000
75%        33.000000
max        72.000000
Name: languages, dtype: float64

Observations with zero languages are musicals

In [11]:
df_copy[df_copy['languages'] == 0].head(3)
Out[11]:
name title description main_speaker speaker_occupation num_speaker duration event film_date published_date comments tags languages ratings related_talks url views
58 Pilobolus: A dance of "Symbiosis" A dance of "Symbiosis" Two Pilobolus dancers perform "Symbiosis." Doe... Pilobolus Dance company 1 13.75 TED2005 1109289600 1170979860 222 ['dance', 'entertainment', 'nature', 'performa... 0 [{'id': 1, 'name': 'Beautiful', 'count': 1810}... [{'id': 40, 'hero': 'https://pe.tedcdn.com/ima... https://www.ted.com/talks/pilobolus_perform_sy... 3051507
115 Ethel: A string quartet plays "Blue Room" A string quartet plays "Blue Room" The avant-garde string quartet Ethel performs ... Ethel String quartet 1 3.57 TED2006 1138838400 1182184140 27 ['cello', 'collaboration', 'culture', 'enterta... 0 [{'id': 1, 'name': 'Beautiful', 'count': 216},... [{'id': 103, 'hero': 'https://pe.tedcdn.com/im... https://www.ted.com/talks/ethel_performs_blue_... 384641
135 Vusi Mahlasela: "Woza" "Woza" After Vusi Mahlasela's 3-song set at TEDGlobal... Vusi Mahlasela Musician, activist 1 4.98 TEDGlobal 2007 1181260800 1187695440 36 ['Africa', 'entertainment', 'guitar', 'live mu... 0 [{'id': 8, 'name': 'Informative', 'count': 4},... [{'id': 158, 'hero': 'https://pe.tedcdn.com/im... https://www.ted.com/talks/vusi_mahlasela_s_enc... 416603
In [12]:
#Indexing the rows
df_copy.reset_index(inplace=True)
In [13]:
from collections import defaultdict
rating_data = defaultdict(list)
In [14]:
import ast
rating_names = set()
for index, row in df_copy.iterrows():
    rating = ast.literal_eval(row['ratings'])
    for item in rating:
        rating_names.add(item['name'])
In [15]:
rating_names
Out[15]:
{'Beautiful',
 'Confusing',
 'Courageous',
 'Fascinating',
 'Funny',
 'Informative',
 'Ingenious',
 'Inspiring',
 'Jaw-dropping',
 'Longwinded',
 'OK',
 'Obnoxious',
 'Persuasive',
 'Unconvincing'}
In [16]:
#Extracting ratings

rating_data = defaultdict(list)
for index, row in df_copy.iterrows():
    rating = ast.literal_eval(row['ratings'])
    rating_data['ID'].append(row['index'])
    names = set()
    for item in rating:
        rating_data[item['name']].append(item['count'])
        names.add(item['name'])

rating_data = pd.DataFrame(rating_data)

rating_data.head()
Out[16]:
ID Funny Beautiful Ingenious Courageous Longwinded Confusing Informative Fascinating Unconvincing Persuasive Jaw-dropping OK Obnoxious Inspiring
0 0 19645 4573 6073 3253 387 242 7346 10581 300 10704 4439 1174 209 24924
1 1 544 58 56 139 113 62 443 132 258 268 116 203 131 413
2 2 964 60 183 45 78 27 395 166 104 230 54 146 142 230
3 3 59 291 105 760 53 32 380 132 36 460 230 85 35 1070
4 4 1390 942 3202 318 110 72 5433 4606 67 2542 3736 248 61 2893
In [17]:
# Extracting tags

tags_data = defaultdict(list)
for index, row in df_copy.iterrows():
    tags = ast.literal_eval(row['tags'])
    for item in tags:
        tags_data['ID'].append(row['index'])
        tags_data['tags'].append(item)

tags_data = pd.DataFrame(tags_data)
In [18]:
tags_data[tags_data['ID']==1]
Out[18]:
ID tags
7 1 alternative energy
8 1 cars
9 1 climate change
10 1 culture
11 1 environment
12 1 global issues
13 1 science
14 1 sustainability
15 1 technology
In [19]:
# Extracting related talks

df_copy['related_views'] = 0
df_copy['related_duration'] = 0
for index, row in df_copy.iterrows():
    rel = row['related_talks'].split(',')
    ctr1 = 0
    tot1 = 0
    ctr2 = 0
    tot2 = 0
    for views in rel:
        if 'viewed_count' in views:
            view = views.split(':')
            view[1] = view[1].replace("]", "")
            view[1] = view[1].replace(" ", "")
            view[1] = view[1].replace("}", "")
            tot1+=int(view[1])
            ctr1+=1
        if 'duration' in views:
            view = views.split(':')
            view[1] = view[1].replace("]", "")
            view[1] = view[1].replace(" ", "")
            view[1] = view[1].replace("}", "")
            tot2+=int(view[1])
            ctr2+=1
    df_copy['related_views'][index] = tot1/ctr1
    df_copy['related_duration'][index] = tot2/ctr2
In [20]:
df_copy.head(3)
Out[20]:
index name title description main_speaker speaker_occupation num_speaker duration event film_date published_date comments tags languages ratings related_talks url views related_views related_duration
0 0 Ken Robinson: Do schools kill creativity? Do schools kill creativity? Sir Ken Robinson makes an entertaining and pro... Ken Robinson Author/educator 1 19.40 TED2006 1140825600 1151367060 4553 ['children', 'creativity', 'culture', 'dance',... 60 [{'id': 7, 'name': 'Funny', 'count': 19645}, {... [{'id': 865, 'hero': 'https://pe.tedcdn.com/im... https://www.ted.com/talks/ken_robinson_says_sc... 47227110 3027062 921
1 1 Al Gore: Averting the climate crisis Averting the climate crisis With the same humor and humanity he exuded in ... Al Gore Climate advocate 1 16.28 TED2006 1140825600 1151367060 265 ['alternative energy', 'cars', 'climate change... 43 [{'id': 7, 'name': 'Funny', 'count': 544}, {'i... [{'id': 243, 'hero': 'https://pe.tedcdn.com/im... https://www.ted.com/talks/al_gore_on_averting_... 3200520 1118767 1096
2 2 David Pogue: Simplicity sells Simplicity sells New York Times columnist David Pogue takes aim... David Pogue Technology columnist 1 21.43 TED2006 1140739200 1151367060 124 ['computers', 'entertainment', 'interface desi... 26 [{'id': 7, 'name': 'Funny', 'count': 964}, {'i... [{'id': 1725, 'hero': 'https://pe.tedcdn.com/i... https://www.ted.com/talks/david_pogue_says_sim... 1636292 1846195 915
In [21]:
df_copy['event_category'] = 'Other'

for i in range(len(df_copy)):
    if df_copy['event'][i][0:5]=='TED20':
        df_copy['event_category'][i] = 'TED2000s'
    elif df_copy['event'][i][0:5]=='TED19':
        df_copy['event_category'][i] = 'TED1900s'
    elif df_copy['event'][i][0:4]=='TEDx':
        df_copy['event_category'][i] = "TEDx"
    elif df_copy['event'][i][0:7]=='TED@BCG':
        df_copy['event_category'][i] = 'TED@BCG'
    elif df_copy['event'][i][0:4]=='TED@':
        df_copy['event_category'][i] = "TED@"
    elif df_copy['event'][i][0:8]=='TEDSalon':
        df_copy['event_category'][i] = "TEDSalon"
    elif df_copy['event'][i][0:9]=='TEDGlobal':
        df_copy['event_category'][i] = 'TEDGlobal'
    elif df_copy['event'][i][0:8]=='TEDWomen':
        df_copy['event_category'][i] = 'TEDWomen'
    elif df_copy['event'][i][0:6]=='TEDMED':
        df_copy['event_category'][i] = 'TEDMED'
    elif df_copy['event'][i][0:3]=='TED':
        df_copy['event_category'][i] = 'TEDOther'
In [22]:
# Convert timestamp into readable format

import datetime

df_copy['published_date'] = df_copy['published_date'].apply(lambda x: datetime.date.fromtimestamp(int(x)))
df_copy['day'] = df_copy['published_date'].apply(lambda x: x.weekday())
df_copy['month'] = df_copy['published_date'].apply(lambda x: x.month)
df_copy['year'] = df_copy['published_date'].apply(lambda x: x.year)
df_copy['film_date'] = df_copy['film_date'].apply(lambda x: datetime.date.fromtimestamp(int(x)))
df_copy['day_film'] = df_copy['film_date'].apply(lambda x: x.weekday())
df_copy['month_film'] = df_copy['film_date'].apply(lambda x: x.month)
df_copy['year_film'] = df_copy['film_date'].apply(lambda x: x.year)
In [23]:
to_cat = {"day":   {0: "Monday", 1: "Tuesday", 2: "Wednesday", 3: "Thurday", 4: "Friday", 5: "Saturday",
                    6: "Sunday" },
          "day_film":   {0: "Monday", 1: "Tuesday", 2: "Wednesday", 3: "Thurday", 4: "Friday", 5: "Saturday",
                    6: "Sunday" }}

df_copy.replace(to_cat, inplace=True)
In [24]:
#create new attributes for length of the title, description, and for number of times the speaker spoke
df_copy['title_len']  = df_copy['title'].str.len()
df_copy['description_len'] = df_copy['description'].str.len()
df_copy['speaker_frequency'] = df_copy.groupby('main_speaker')['index'].transform('count')
df_copy['repeat_speaker'] = np.where((df_copy['speaker_frequency'] >1),1,0)
In [25]:
temp = tags_data.groupby(['tags']).count()
temp = temp.sort_values(by='ID',ascending=False)
In [26]:
temp.head(3)
Out[26]:
ID
tags
technology 701
science 534
global issues 478
In [27]:
# Creating Tag Categories

df_copy['Technology/Science'] = 0
df_copy['Humanity'] = 0
df_copy['Global Issues'] = 0
df_copy['Art/Creativity'] = 0
df_copy['Business'] = 0
df_copy['Entertainment'] = 0
df_copy['Health'] = 0
df_copy['Communication'] = 0
df_copy['Education']=0

Tech = ['technology','future','comuters','science','invention','research']
Humanity = ['community','society','social change','humanity','culture']
Global_Issues = ['global issues','activism','politics','inequality','environment','climate change']
Art = ['design','art','innovation','creativity','brain']
Business = ['business','economics']
Entertainment = ['entertainment','media','sports']
Health = ['health','biollogy','medicine','health care','medical research']
Communication = ['communication','collaboration']
Education = ['children','education','teaching','parenting']


for i in range(len(tags_data)):
    index = tags_data['ID'][i]
    if tags_data['tags'][i] in Tech:
        df_copy['Technology/Science'][index]=1
    if tags_data['tags'][i] in Humanity:
        df_copy['Humanity'][index]=1
    if tags_data['tags'][i] in Global_Issues:
        df_copy['Global Issues'][index]=1
    if tags_data['tags'][i] in Art:
        df_copy['Art/Creativity'][index]=1
    if tags_data['tags'][i] in Business:
        df_copy['Business'][index]=1
    if tags_data['tags'][i] in Entertainment:
        df_copy['Entertainment'][index]=1
    if tags_data['tags'][i] in Health:
        df_copy['Health'][index]=1
    if tags_data['tags'][i] in Communication:
        df_copy['Communication'][index]=1
    if tags_data['tags'][i] in Education:
        df_copy['Education'][index]=1
In [28]:
df_copy = df_copy.drop(['index','comments', 'event', 'film_date', 'main_speaker', 'name', 'published_date', 'ratings', 'url', 'description', 'title',
           'related_talks', 'tags', 'title','speaker_occupation'], 1)
In [29]:
df_copy.head()
Out[29]:
num_speaker duration languages views related_views related_duration event_category day month year day_film month_film year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education
0 1 19.40 60 47227110 3027062 921 TED2000s Monday 6 2006 Friday 2 2006 27 149 3 1 0 1 0 1 0 0 0 0 1
1 1 16.28 43 3200520 1118767 1096 TED2000s Monday 6 2006 Friday 2 2006 27 233 4 1 1 1 1 0 0 0 0 0 0
2 1 21.43 26 1636292 1846195 915 TED2000s Monday 6 2006 Thurday 2 2006 16 202 3 1 1 0 0 0 0 1 0 0 0
3 1 18.60 35 1697550 776189 748 TED2000s Monday 6 2006 Saturday 2 2006 19 213 2 1 0 0 1 0 1 0 0 0 0
4 1 19.83 48 12005869 1907337 943 TED2000s Tuesday 6 2006 Tuesday 2 2006 31 172 9 1 0 0 1 0 1 0 1 0 0
In [30]:
import pandas_profiling
from pandas_profiling import ProfileReport
profile = ProfileReport(df_copy, title='Pandas Profiling Report', html={'style':{'full_width':True}})
profile
Out[30]:


II - Split Data

Due to the nature of the dataset, we decided to use a time based cross validation approach. The data itself is not time-series but still has a time dimension aspect (the date of the Ted Talk itself) which can be quite insightful for the current model we are building.

With the evolving popularity, environment, and subjects of TedTalks, it is best to use time-based splitting to build statistically robust models and follow up with time based cross validations to evaluate the performance of our final model.

The basis of the splits for the train and test data will be the year of release, with consideration with the amount of talks in each year to ensure a 80% and 20% split for training and test, respectively.

In [31]:
df_copy.year.value_counts().sort_index()
Out[31]:
2006     50
2007    121
2008    162
2009    205
2010    216
2011    248
2012    303
2013    243
2014    240
2015    215
2016    239
2017    197
Name: year, dtype: int64

First split: Training data for model building and cross validation and Test or "New data" -- data only used for final model evaluation

  • Total number of observations: 2439
  • Train (80%) = roughly 1952 observations
  • Test (20%) = roughly 486 observations

Thus, based on the number of observations from each year, the first split will be

  • Train : Ted Talks from 2006 until 2015 (2003 observations)
  • Test: Ted Talks from 2016 and 2017 (436 observations)

Second split: Data for model building and cross validation

  • Total number of observations of the train set : 2003 observations
  • Train 2 set split (80%) : Ted talks from 2006 - 2013 (1548 observations)
  • Test 2 set split (20%) : Ted talks from 2014 - 2015 (455 observations)

image.png

First split:

We perform the first split now, and do not touch the test data until the end. We perform our second split of the training dataset right before modeling.

In [32]:
#Train data that will further be split into train and test for model building and evaluation
train = df_copy[(df_copy["year"] >= 2006) & (df_copy["year"] <= 2015)]

# Test data ("new data") -- will be set aside and only be used for final model evaluation
test = df_copy[(df_copy["year"]) >= 2016]  

III - Exploratory Data Analysis (EDA)

Lets perform some basic data analysis on train.

Preliminary visualization

In [33]:
train.columns
Out[33]:
Index(['num_speaker', 'duration', 'languages', 'views', 'related_views',
       'related_duration', 'event_category', 'day', 'month', 'year',
       'day_film', 'month_film', 'year_film', 'title_len', 'description_len',
       'speaker_frequency', 'repeat_speaker', 'Technology/Science', 'Humanity',
       'Global Issues', 'Art/Creativity', 'Business', 'Entertainment',
       'Health', 'Communication', 'Education'],
      dtype='object')
In [34]:
# Plot some histograms on the training dataset 

train.hist(bins=50, figsize=(20,15))
save_fig("EDA_histograms")
plt.show()
Saving figure EDA_histograms
In [35]:
# Lets check for correlation

import seaborn as sns
from seaborn import pairplot
sns.set_style("whitegrid")

corr_continuous = train[['duration', 'languages','views','related_views','related_duration', 'title_len', 'description_len']]

plt.figure(figsize=(14,8))
sns.heatmap(corr_continuous.corr(), annot = True, cmap="Blues", linewidths=.1)
save_fig("correlation_matrix")
plt.show()

display(corr_continuous.corr())
Saving figure correlation_matrix
duration languages views related_views related_duration title_len description_len
duration 1.000000 -0.316973 0.077581 -0.003139 0.296347 0.002012 0.010990
languages -0.316973 1.000000 0.400823 0.195044 -0.094257 -0.034059 0.009188
views 0.077581 0.400823 1.000000 0.285808 -0.006356 -0.010829 0.022803
related_views -0.003139 0.195044 0.285808 1.000000 0.046758 -0.015244 -0.016942
related_duration 0.296347 -0.094257 -0.006356 0.046758 1.000000 -0.020396 -0.081839
title_len 0.002012 -0.034059 -0.010829 -0.015244 -0.020396 1.000000 0.402069
description_len 0.010990 0.009188 0.022803 -0.016942 -0.081839 0.402069 1.000000
  • While duration have very weak correlation to views, number of speakers and different languages, suggesting longer talks don't seem to gather more attention from the audience.

TED's categories: differents durations and views

Lets see how duration and views fluctuate through categories.

Duration per category

In [36]:
print("Average duration = {}".format(round(train["duration"].mean(), 2)))
print("Median duration = {}".format(round(train["duration"].median(), 2)))

plt.figure(figsize=(10,5))
ax = sns.barplot(x="duration", y="event_category", data=train.sort_values('duration', ascending=False))
ax.set_title("Average TED Talk Duration by Category", pad=10, fontdict={'fontsize': 20})
save_fig("EDA_duration_by_category")

plt.show()
Average duration = 13.6
Median duration = 14.38
Saving figure EDA_duration_by_category
  • Knowing that on an average most TED talks are usually about 18 mins long, average of 13.6 mins and median of 14.1 mins for the entore dataset seems a little lower.

  • TEDOther seems to be the most popular type of event. Infact most of the top duration talks aren't TED talks as seen from the event column. These talks were just hosted on ted.com website.

  • Having talks that over an hour are uncommon.

Views per category

In [37]:
print("Average views = {}".format(round(train["views"].mean(), 2)))
print("Standard Deviation (views) = {}".format(round(train["views"].std(), 2)))

print("Median views = {}".format(round(train["views"].median(), 2)))

plt.figure(figsize=(10,5))
ax = sns.barplot(x="views", y="event_category", data=train.sort_values('views', ascending=False))
ax.set_title("Average TED Talk Views by Category", pad=10, fontdict={'fontsize': 20})
save_fig("EDA_views_by_category")

plt.show()
Average views = 1788792.0
Standard Deviation (views) = 2713736.71
Median views = 1135382.0
Saving figure EDA_views_by_category
  • The most views came from TEDOther events, with almost 20 million views, which is huge when compared to average views of about 1.7 million, and a median of about 1.12 million.
  • At second place is TEDx with more than 12.5 million views.

Talks over years and months

How are the talks distributed throughout time since its creation?

Years

In [38]:
talk_years = train['year'].value_counts().reset_index()
talk_years.columns = ["year", "no_of_talks"]

plt.figure(figsize=(18,5))
ax = sns.barplot(x="year", y="no_of_talks", data=talk_years)
ax.set_title("TED Talks by Release Year", pad=10, fontdict={'fontsize': 20})
save_fig("EDA_talks_by_year")

plt.show()
Saving figure EDA_talks_by_year
  • Number of talks increased consistently from 2006 to 2012 and then stabilized.

Months

In [39]:
talk_years = train['month'].value_counts().reset_index()
talk_years.columns = ["month", "no_of_talks"]

plt.figure(figsize=(18,5))
ax = sns.barplot(x="month", y="no_of_talks", data=talk_years)
ax.set_title("TED Talks by Release Month", pad=10, fontdict={'fontsize': 20})
save_fig("EDA_talks_by_month")

plt.show()

#1 = january ; 12 = December
Saving figure EDA_talks_by_month

Other Attributes vs. Target (Views)

The most popular TED talks:

  • featured speakers that had spoken at a previous TED talk
  • were on business topics
  • were posted on Friday
  • were around 20 minutes
  • were translated into many languages
  • had medium description lengths
In [40]:
plt.figure(figsize=(18,5))
ax = sns.barplot(x="repeat_speaker", y="views", data=train)
ax.set_title("TED Talks by Repeat Speaker", pad=10, fontdict={'fontsize': 20})
save_fig("EDA_talks_by_repeat_speaker")

plt.show()
Saving figure EDA_talks_by_repeat_speaker
In [41]:
plt.figure(figsize=(18,5))
ax = sns.barplot(x="Business", y="views", data=train)
ax.set_title("TED Talks by Business or Not Business Category", pad=10, fontdict={'fontsize': 20})
save_fig("EDA_talks_by_business_cat")

plt.show()
Saving figure EDA_talks_by_business_cat
In [42]:
plt.figure(figsize=(18,5))
ax = sns.barplot(x="day", y="views", data=train)
ax.set_title("TED Talk Views by Day of Week", pad=10, fontdict={'fontsize': 20})
save_fig("EDA_talks_by_dayof_week")

plt.show()
Saving figure EDA_talks_by_dayof_week
In [43]:
sns.relplot(x ='duration', y = 'views', data = train)
save_fig("EDA_duration_views")

sns.relplot(x ='languages', y = 'views', data = train)
save_fig("EDA_languages_views")

sns.relplot(x ='description_len', y = 'views', data = train)
save_fig("EDA_descriptionlen_views")
Saving figure EDA_duration_views
Saving figure EDA_languages_views
Saving figure EDA_descriptionlen_views

Outlier Detection

We identified 51 outlier in this instance.

In [44]:
#use isolation forest to examine most extreme 2.5% of outliers from continuous variables from the analysis, which were visually identified during data exploration
continuous_vars = train[['duration', 'languages','views','related_views','related_duration']]

from sklearn.ensemble import IsolationForest
iforest = IsolationForest(n_estimators = 100,random_state=13, contamination=0.025) 

#predict anomalies
pred = iforest.fit_predict(continuous_vars)
#anomaly score of each 
iforest_scores = iforest.decision_function(continuous_vars)

#create a new variable to store the index number where anomaly =-1 in the anomalies vector
from numpy import where
anomaly_index = where(pred ==-1)
#extract values corresponding to the index of anomalies from the main df
anomaly_values = train.iloc[anomaly_index]
anomaly_values.head(3)
Out[44]:
num_speaker duration languages views related_views related_duration event_category day month year day_film month_film year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education
0 1 19.40 60 47227110 3027062 921 TED2000s Monday 6 2006 Friday 2 2006 27 149 3 1 0 1 0 1 0 0 0 0 1
5 1 21.75 36 20685401 8376548 851 TED2000s Tuesday 6 2006 Wednesday 2 2006 20 122 1 0 0 1 0 0 1 1 0 0 0
47 1 3.50 66 10841210 11888126 894 TED2000s Wednesday 12 2006 Tuesday 2 2005 20 209 2 1 0 1 0 0 1 1 0 0 0

Further Ted Talks exploration: Cluster Visualization

To better understand the dataset, we tried to identify some groups of TED talks that share similar characteristics.

In [45]:
#make a copy of this dataset for clustering at the end
train_clust = train.copy()

train_cluster = train_clust[['duration','views', 'languages']]

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
train_clust_std = scaler.fit_transform(train_cluster)
In [46]:
#Importing required modules
 
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
import numpy as np

#Transform the data
pca = PCA(2)
pca_cluster = pca.fit_transform(train_clust_std)
In [47]:
#Elbow Method
from sklearn.cluster import KMeans
withinss = []
for i in range(2,11):
    kmeans = KMeans(n_clusters=i)
    model_clustering = kmeans.fit(train_clust_std)
    withinss.append(model_clustering.inertia_) 
    print('clusters: ', i, ', inertia: ', model_clustering.inertia_)
    
from matplotlib import pyplot
pyplot.plot([2,3,4,5,6,7,8,9,10],withinss)    
#the optimal number of clusters is 4 
clusters:  2 , inertia:  4224.2781196474425
clusters:  3 , inertia:  3035.0520047217165
clusters:  4 , inertia:  2457.852000379606
clusters:  5 , inertia:  2070.0018406282043
clusters:  6 , inertia:  1736.927166718721
clusters:  7 , inertia:  1444.4337064340314
clusters:  8 , inertia:  1269.278979374476
clusters:  9 , inertia:  1138.1888128766882
clusters:  10 , inertia:  1015.2805558637834
Out[47]:
[<matplotlib.lines.Line2D at 0x1f20db7c088>]
In [48]:
from sklearn.cluster import KMeans    
kmeans = KMeans(n_clusters=4, random_state = 42)    
model_clustering = kmeans.fit(pca_cluster)
labels = model_clustering.predict(pca_cluster)    

from sklearn.metrics import silhouette_samples
from sklearn.metrics import silhouette_score
silhouette_score(pca_cluster,labels)
silhouette=silhouette_samples(pca_cluster,labels)
np.average(silhouette)

import matplotlib.pyplot as plt

#Getting the Centroids
centroids = kmeans.cluster_centers_
#Getting unique labels
u_labels = np.unique(labels)
 
#plotting the results:
 
for i in u_labels:
    plt.scatter(pca_cluster[labels == i , 0] , pca_cluster[labels == i , 1] , label = i)
plt.scatter(centroids[:,0] , centroids[:,1] , s = 10,color="black")
plt.legend()
save_fig("clusters")

plt.show()
Saving figure clusters
In [49]:
#number of observations in each cluster 
from collections import Counter
Counter(labels)
Out[49]:
Counter({2: 53, 3: 846, 1: 592, 0: 512})
In [50]:
#group numbers into percentiles for easy visualization
cols = ['duration', 'views', 'languages']
new_cols = ['duration_p','views_p', 'languages_p']
for i in new_cols:
    train_cluster[i] = 0
for k in range(len(cols)):
    percentile = np.percentile(train_cluster[cols[k]], [25,50,75])
    for i in range(len(train_cluster)):
        if train_cluster[cols[k]][i] <= percentile[0]:
            train_cluster[new_cols[k]][i] = '0-25'
        elif train_cluster[cols[k]][i] <= percentile[1]:
            train_cluster[new_cols[k]][i] = '26-50'
        elif train_cluster[cols[k]][i] <= percentile[2]:
            train_cluster[new_cols[k]][i] = '51-75'
        else:
            train_cluster[new_cols[k]][i] = '76-100'
In [51]:
#rename cluster labels
train_cluster.loc[:,'cluster_labels'] = labels
train_cluster.loc[:, 'event_category'] = train_clust['event_category']


train_cluster.loc[train_cluster['cluster_labels'] ==0, 'cluster_labels'] = 'Cluster 1'
train_cluster.loc[train_cluster['cluster_labels'] ==1, 'cluster_labels'] = 'Cluster 2'
train_cluster.loc[train_cluster['cluster_labels'] ==2, 'cluster_labels'] = 'Cluster 3'
train_cluster.loc[train_cluster['cluster_labels'] ==3, 'cluster_labels'] = 'Cluster 4'
train_cluster.loc[train_cluster['cluster_labels'] ==4, 'cluster_labels'] = 'Cluster 5'
In [52]:
train_cluster.head(3)
Out[52]:
duration views languages duration_p views_p languages_p cluster_labels event_category
0 19.40 47227110 60 76-100 76-100 76-100 Cluster 3 TED2000s
1 16.28 3200520 43 51-75 76-100 76-100 Cluster 4 TED2000s
2 21.43 1636292 26 76-100 51-75 26-50 Cluster 2 TED2000s
In [53]:
import plotly.express as px

#CLUSTER DISTRIBUTION FOR DURATION
fig = px.histogram(train_cluster, x = 'cluster_labels', color= 'duration_p',
                   labels=dict(goal_usd1 = 'Percentile'), 
                   category_orders={'duration_p': ['0-25', '26-50', '51-75', '75-100']}).update_xaxes(categoryorder='category ascending')

fig.update_layout(title_text='Clusters by Talk Duration (relative percentiles)', title_x=0.5)
save_fig("clusters_by_duration")

fig.show()
Saving figure clusters_by_duration
<Figure size 432x288 with 0 Axes>
In [54]:
import plotly.express as px

#CLUSTER DISTRIBUTION FOR LANGUAGES 
fig = px.histogram(train_cluster, x = 'cluster_labels', color= 'languages_p',
                   labels=dict(goal_usd1 = 'Percentile'), 
                   category_orders={'languages_p': ['0-25', '26-50', '51-75', '75-100']}).update_xaxes(categoryorder='category ascending')

fig.update_layout(title_text='Clusters by Num of Languages (relative percentiles)', title_x=0.5)
save_fig("clusters_by_languages")

fig.show()
Saving figure clusters_by_languages
<Figure size 432x288 with 0 Axes>
In [55]:
import plotly.express as px

#CLUSTER DISTRIBUTION FOR VIEWS
fig = px.histogram(train_cluster, x = 'cluster_labels', color= 'views_p',
                   labels=dict(goal_usd1 = 'Percentile'), 
                   category_orders={'views_p': ['0-25', '26-50', '51-75', '75-100']}).update_xaxes(categoryorder='category ascending')

fig.update_layout(title_text='Clusters by Views (relative percentile)', title_x=0.5)
save_fig("clusters_by_views")

fig.show()
Saving figure clusters_by_views
<Figure size 432x288 with 0 Axes>
In [56]:
import plotly.express as px

#CLUSTER DISTRIBUTION FOR VIEWS
fig = px.histogram(train_cluster, x = 'cluster_labels', color= 'event_category',
                   labels=dict(goal_usd1 = 'Percentile'), 
                   category_orders={'views_p': ['0-25', '26-50', '51-75', '75-100']}).update_xaxes(categoryorder='category ascending')

fig.update_layout(title_text='Clusters by Event Category (relative percentile)', title_x=0.5)
save_fig("clusters_by_event_cat")

fig.show()
Saving figure clusters_by_event_cat
<Figure size 432x288 with 0 Axes>

IV - Prepare data for modeling

Drop useless and invalid predictors

We first drop columns related to recommended videos as we assume that this will not be generated until the video has been posted online. We also assume that we cannot use languages as TED uses a volunteer based translation service where viewers can translate their favourite talks: https://www.ted.com/about/programs-initiatives/ted-translators. We assume this would not occur until after the video is posted also.

In [57]:
train.drop(['related_views','related_duration', 'languages'], axis = 1, inplace = True)
In [58]:
train
Out[58]:
num_speaker duration views event_category day month year day_film month_film year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education
0 1 19.40 47227110 TED2000s Monday 6 2006 Friday 2 2006 27 149 3 1 0 1 0 1 0 0 0 0 1
1 1 16.28 3200520 TED2000s Monday 6 2006 Friday 2 2006 27 233 4 1 1 1 1 0 0 0 0 0 0
2 1 21.43 1636292 TED2000s Monday 6 2006 Thurday 2 2006 16 202 3 1 1 0 0 0 0 1 0 0 0
3 1 18.60 1697550 TED2000s Monday 6 2006 Saturday 2 2006 19 213 2 1 0 0 1 0 1 0 0 0 0
4 1 19.83 12005869 TED2000s Tuesday 6 2006 Tuesday 2 2006 31 172 9 1 0 0 1 0 1 0 1 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1998 1 8.82 1453242 TEDx Wednesday 12 2015 Wednesday 11 2014 54 457 2 1 0 1 0 0 0 0 0 0 0
1999 2 15.73 2269844 TEDWomen Thurday 12 2015 Tuesday 5 2015 53 419 1 0 0 1 1 1 0 0 0 0 0
2000 1 19.90 1117165 TEDGlobal Friday 12 2015 Monday 12 2015 39 542 1 0 0 0 0 0 0 1 0 0 0
2001 1 9.47 1254964 TEDGlobal Monday 12 2015 Monday 6 2015 59 547 1 0 0 0 0 0 0 0 0 0 0
2002 1 12.77 16601927 TEDx Wednesday 12 2015 Friday 11 2015 67 476 1 0 0 0 0 1 0 0 0 0 0

2003 rows × 23 columns

Categorical Encoding

In [59]:
train_copy = pd.get_dummies(data = train, columns = ["event_category","day","month","day_film","month_film"])
In [60]:
train_crossval = train_copy.copy()
#this is the dataset that will be used for cross validation of the final models
In [61]:
# training data for the different models that will be tested
train2 = train[(train["year"] >= 2006) & (train["year"] <= 2013)]

# test data that will be use to evaluate all tested models
test2 = train[(train["year"]) >= 2014]

Second split : Data for model building and cross validation

In [62]:
y_train2 = train2['views']
X_train2 = train2.drop('views', axis = 1)

y_test2 = test2['views']
X_test2 = test2.drop('views', axis = 1)
In [63]:
y_test = test['views']
X_test = test.drop('views', axis = 1)

Preprocessing Pipeline

In [64]:
#Extracting Column names

X_train_copy = X_train2[["event_category","day","month","day_film","month_film"]]
X_train_copy_cat = pd.get_dummies(data = X_train_copy, columns = ["event_category","day","month","day_film","month_film"])
X_train_copy_num = X_train2.drop(["event_category","day","month","day_film","month_film"],axis=1)

num_cols=list(X_train_copy_num.columns)
cat_cols=list(X_train_copy_cat.columns)

col = cat_cols+num_cols
In [65]:
#Pipeline starts here

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

train_num = X_train2.drop(["event_category","day","month","day_film","month_film"], axis=1)
test_num = X_test.drop(["event_category","day","month","day_film","month_film"], axis=1)

num_pipeline = Pipeline([
        ('std_scaler', StandardScaler()),
    ])
train_num_tr = num_pipeline.fit_transform(train_num)
test_num_tr = num_pipeline.fit_transform(test_num)

from sklearn.compose import ColumnTransformer

num_attribs_train = list(X_train2.drop(["event_category","day","month","day_film","month_film"], axis=1))
num_attribs_test = list(X_test.drop(["event_category","day","month","day_film","month_film"], axis=1))

cat_attribs = ["event_category","day","month","day_film","month_film"]

full_pipeline = ColumnTransformer([
        ("cat", OneHotEncoder(), cat_attribs),
        ("num", num_pipeline, num_attribs_train),
    ])

train_prepared = full_pipeline.fit_transform(X_train2)
test_prepared2 = full_pipeline.fit_transform(X_test2)
test_prepared = full_pipeline.fit_transform(X_test)

#Convert array to dataframe for feature selection

#test and test2 datasets do not have the categories event_category_TED1900s, day_Saturday and day_Sunday.
#Hence, we add a dummy array to retain shape across the train and test datasets

a = np.zeros((455,))
b = np.zeros((436,))

test_preped2 = np.insert(test_prepared2, 0, a, axis=1)
test_preped2 = np.insert(test_preped2, 12, a, axis=1)
test_preped2 = np.insert(test_preped2, 13, a, axis=1)

test_preped = np.insert(test_prepared, 0, b, axis=1)
test_preped = np.insert(test_preped, 12, b, axis=1)
test_preped = np.insert(test_preped, 13, b, axis=1)

X_train2 = pd.DataFrame(train_prepared, columns=col)
X_test = pd.DataFrame(test_preped, columns=col)
X_test2 = pd.DataFrame(test_preped2, columns=col)
In [66]:
X_train2.head()
Out[66]:
event_category_TED1900s event_category_TED2000s event_category_TED@ event_category_TED@BCG event_category_TEDGlobal event_category_TEDMED event_category_TEDOther event_category_TEDSalon event_category_TEDWomen event_category_TEDx day_Friday day_Monday day_Saturday day_Sunday day_Thurday day_Tuesday day_Wednesday month_1 month_2 month_3 month_4 month_5 month_6 month_7 month_8 month_9 month_10 month_11 month_12 day_film_Friday day_film_Monday day_film_Saturday day_film_Sunday day_film_Thurday day_film_Tuesday day_film_Wednesday month_film_1 month_film_2 month_film_3 month_film_4 month_film_5 month_film_6 month_film_7 month_film_8 month_film_9 month_film_10 month_film_11 month_film_12 num_speaker duration year year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education
0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.138482 0.959610 -2.151435 -1.226219 -0.417359 -1.550099 1.294177 1.514469 -0.910927 1.496160 -0.647211 1.420403 -0.465812 -0.461699 -0.372925 -0.289886 3.110605
1 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.138482 0.423795 -2.151435 -1.226219 -0.417359 -0.469920 2.185760 1.514469 1.097783 1.496160 1.545092 -0.704026 -0.465812 -0.461699 -0.372925 -0.289886 -0.321481
2 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.138482 1.308234 -2.151435 -1.226219 -1.595539 -0.868558 1.294177 1.514469 1.097783 -0.668378 -0.647211 -0.704026 -0.465812 2.165912 -0.372925 -0.289886 -0.321481
3 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.138482 0.822222 -2.151435 -1.226219 -1.274217 -0.727106 0.402595 1.514469 -0.910927 -0.668378 1.545092 -0.704026 2.146787 -0.461699 -0.372925 -0.289886 -0.321481
4 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.138482 1.033457 -2.151435 -1.226219 0.011071 -1.254336 6.643674 1.514469 -0.910927 -0.668378 1.545092 -0.704026 2.146787 -0.461699 2.681506 -0.289886 -0.321481
In [67]:
X_test2.head()
Out[67]:
event_category_TED1900s event_category_TED2000s event_category_TED@ event_category_TED@BCG event_category_TEDGlobal event_category_TEDMED event_category_TEDOther event_category_TEDSalon event_category_TEDWomen event_category_TEDx day_Friday day_Monday day_Saturday day_Sunday day_Thurday day_Tuesday day_Wednesday month_1 month_2 month_3 month_4 month_5 month_6 month_7 month_8 month_9 month_10 month_11 month_12 day_film_Friday day_film_Monday day_film_Saturday day_film_Sunday day_film_Thurday day_film_Tuesday day_film_Wednesday month_film_1 month_film_2 month_film_3 month_film_4 month_film_5 month_film_6 month_film_7 month_film_8 month_film_9 month_film_10 month_film_11 month_film_12 num_speaker duration year year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education
0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.119443 0.475647 -0.946485 -1.519912 -0.350389 -0.274149 -0.425585 -0.513701 -0.625650 2.042169 -0.547723 -0.588348 -0.366965 -0.284183 -0.389742 -0.208753 -0.23074
1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 -0.119443 0.268987 -0.946485 -1.519912 -0.350389 -0.142344 -0.425585 -0.513701 1.598339 -0.489675 1.825742 -0.588348 -0.366965 -0.284183 -0.389742 -0.208753 -0.23074
2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 -0.119443 0.264974 -0.946485 -2.865359 -1.515790 -0.608729 -0.425585 -0.513701 -0.625650 -0.489675 -0.547723 1.699673 -0.366965 -0.284183 -0.389742 -0.208753 -0.23074
3 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 -0.119443 -0.346981 -0.946485 -1.519912 -0.583469 0.354457 -0.425585 -0.513701 1.598339 2.042169 -0.547723 1.699673 -0.366965 -0.284183 -0.389742 -0.208753 -0.23074
4 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.119443 -0.035988 -0.946485 -1.519912 -0.894243 0.334179 -0.425585 -0.513701 1.598339 -0.489675 -0.547723 1.699673 -0.366965 3.518857 -0.389742 -0.208753 -0.23074
In [68]:
X_test.head()
Out[68]:
event_category_TED1900s event_category_TED2000s event_category_TED@ event_category_TED@BCG event_category_TEDGlobal event_category_TEDMED event_category_TEDOther event_category_TEDSalon event_category_TEDWomen event_category_TEDx day_Friday day_Monday day_Saturday day_Sunday day_Thurday day_Tuesday day_Wednesday month_1 month_2 month_3 month_4 month_5 month_6 month_7 month_8 month_9 month_10 month_11 month_12 day_film_Friday day_film_Monday day_film_Saturday day_film_Sunday day_film_Thurday day_film_Tuesday day_film_Wednesday month_film_1 month_film_2 month_film_3 month_film_4 month_film_5 month_film_6 month_film_7 month_film_8 month_film_9 month_film_10 month_film_11 month_film_12 num_speaker duration year year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education
0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 -0.177069 0.169081 -0.907892 -1.334882 -0.728010 1.350185 -0.343770 -0.429863 -0.933405 -1.091309 -0.807165 -0.895351 -0.418572 -0.3 2.267343 -0.738985 -0.448444
1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.177069 -0.143956 -0.907892 -1.334882 -0.539243 -0.546616 2.778811 2.326320 1.071346 -1.091309 -0.807165 -0.895351 -0.418572 -0.3 -0.441045 -0.738985 -0.448444
2 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -0.177069 -1.576153 -0.907892 -1.334882 -0.633627 1.039069 -0.343770 -0.429863 -0.933405 -1.091309 1.238904 -0.895351 2.389078 -0.3 -0.441045 -0.738985 -0.448444
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 -0.177069 0.101563 -0.907892 -1.334882 -0.067324 0.818278 -0.343770 -0.429863 -0.933405 -1.091309 1.238904 -0.895351 -0.418572 -0.3 -0.441045 -0.738985 -0.448444
4 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 -0.177069 -0.680007 -0.907892 -1.334882 0.593362 -0.355932 1.217520 2.326320 1.071346 0.916331 -0.807165 -0.895351 -0.418572 -0.3 2.267343 -0.738985 -0.448444

Feature selection

We now perform recursive feature elimination using random forest, and select top 35 features.

In [69]:
from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestRegressor
rf = RandomForestRegressor(random_state=42)
model = RFE(rf, n_features_to_select=50)
fit_model = model.fit(X_train2, y_train2)
features = pd.DataFrame(list(zip(X_train2.columns,fit_model.ranking_)), columns = ['predictor','ranking'])
In [70]:
chosen_features = features.sort_values(by = 'ranking').head(35)['predictor'].tolist()
chosen_features
Out[70]:
['day_film_Sunday',
 'day_film_Saturday',
 'Communication',
 'day_film_Thurday',
 'day_film_Tuesday',
 'day_film_Wednesday',
 'month_film_2',
 'month_film_5',
 'month_film_6',
 'month_film_7',
 'month_film_9',
 'month_film_10',
 'num_speaker',
 'duration',
 'year',
 'year_film',
 'title_len',
 'description_len',
 'speaker_frequency',
 'repeat_speaker',
 'Technology/Science',
 'Humanity',
 'Global Issues',
 'Art/Creativity',
 'Business',
 'Entertainment',
 'day_film_Monday',
 'day_film_Friday',
 'Education',
 'month_11',
 'day_Sunday',
 'day_Thurday',
 'event_category_TEDx',
 'month_12',
 'day_Tuesday']
In [71]:
X_train2 = X_train2[chosen_features]
X_test2 = X_test2[chosen_features]
In [72]:
X_test=X_test[chosen_features]
In [73]:
X_train2.shape
Out[73]:
(1548, 35)
In [74]:
X_test2.shape
Out[74]:
(455, 35)
In [75]:
X_train2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1548 entries, 0 to 1547
Data columns (total 35 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   day_film_Sunday      1548 non-null   float64
 1   day_film_Saturday    1548 non-null   float64
 2   Communication        1548 non-null   float64
 3   day_film_Thurday     1548 non-null   float64
 4   day_film_Tuesday     1548 non-null   float64
 5   day_film_Wednesday   1548 non-null   float64
 6   month_film_2         1548 non-null   float64
 7   month_film_5         1548 non-null   float64
 8   month_film_6         1548 non-null   float64
 9   month_film_7         1548 non-null   float64
 10  month_film_9         1548 non-null   float64
 11  month_film_10        1548 non-null   float64
 12  num_speaker          1548 non-null   float64
 13  duration             1548 non-null   float64
 14  year                 1548 non-null   float64
 15  year_film            1548 non-null   float64
 16  title_len            1548 non-null   float64
 17  description_len      1548 non-null   float64
 18  speaker_frequency    1548 non-null   float64
 19  repeat_speaker       1548 non-null   float64
 20  Technology/Science   1548 non-null   float64
 21  Humanity             1548 non-null   float64
 22  Global Issues        1548 non-null   float64
 23  Art/Creativity       1548 non-null   float64
 24  Business             1548 non-null   float64
 25  Entertainment        1548 non-null   float64
 26  day_film_Monday      1548 non-null   float64
 27  day_film_Friday      1548 non-null   float64
 28  Education            1548 non-null   float64
 29  month_11             1548 non-null   float64
 30  day_Sunday           1548 non-null   float64
 31  day_Thurday          1548 non-null   float64
 32  event_category_TEDx  1548 non-null   float64
 33  month_12             1548 non-null   float64
 34  day_Tuesday          1548 non-null   float64
dtypes: float64(35)
memory usage: 423.4 KB

Scale Features

In [76]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train2)
X_train2 = pd.DataFrame(X_train_scaled)

X_test_scaled = scaler.fit_transform(X_test2)
X_test2 = pd.DataFrame(X_test_scaled)

V - Select and train a model

In [77]:
#Import models' packages
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, mean_absolute_error


#Regression Models
regression_models = [['RF regressor',RandomForestRegressor()], 
                     ['GBT',GradientBoostingRegressor()],
                     ['LASSO', Lasso()],
                     ['Ridge', Ridge()]]
                     

#storing MSEs and MAEs for visualization
Model_MSEs = []
Model_MAEs = []
Models = ['Random Forest Regressor', 'Gradient Boosting Regressor', 'Lasso', 'Ridge']

for name, model in regression_models: 
    print('Model ', name, "performance: ")
    model.fit(X_train2, y_train2.values.ravel())
    y_test_pred = model.predict(X_test2)
    
    MSE = mean_squared_error(y_test2, y_test_pred)
    MAE = mean_absolute_error(y_test2, y_test_pred)
    Model_MSEs.append(MSE)
    Model_MAEs.append(MAE)

    print("MSE: ", MSE)
    print("MAE: ", MAE)
    print("-" * 19)
Model  RF regressor performance: 
MSE:  3729015820318.8193
MAE:  1037114.5604395604
-------------------
Model  GBT performance: 
MSE:  4088391912643.5757
MAE:  1013861.8161276642
-------------------
Model  LASSO performance: 
MSE:  3250046319628.9155
MAE:  981656.8229258836
-------------------
Model  Ridge performance: 
MSE:  3246170992333.408
MAE:  980253.8586267291
-------------------
C:\Users\Sophie\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:531: ConvergenceWarning:

Objective did not converge. You might want to increase the number of iterations. Duality gap: 93653358675172.0, tolerance: 1337178816345.5298

In [78]:
y_test2.mean()
Out[78]:
1888806.4175824176
In [79]:
len(Model_MSEs)
Out[79]:
4

Simple visualization of the initial results - MSE

In [80]:
d = {'Model' : Models, 'MSE': Model_MSEs}
MSE_vis = pd.DataFrame(d)

fig_dims = (15, 4)
fig, ax = plt.subplots(figsize=fig_dims)
ax = sns.barplot(x="Model", y="MSE", data=d, ax = ax)
ax.set_title("MSE of different tested models", pad=10, fontdict={'fontsize': 20})
ax.set_xlabel("Regression models",fontsize=20)
save_fig("MSE_initialmodels")

plt.show()
Saving figure MSE_initialmodels

Simple visualization of the initial results - MAE

In [81]:
df_MAE = {'Model' : Models, 'MAE': Model_MAEs}

fig_dims = (15, 4)
fig, ax = plt.subplots(figsize=fig_dims)
ax = sns.barplot(x="Model", y="MAE", data=df_MAE, ax = ax)
ax.set_title("MAE of different tested models", pad=10, fontdict={'fontsize': 20})
ax.set_xlabel("Regression models",fontsize=20)
save_fig("MAE_initialmodels")

plt.show()
Saving figure MAE_initialmodels

Testing more advanced regression models

  1. XGBoost
  2. Light GMB
  3. SVM
  4. KNN

I - XGBoost

In [82]:
import xgboost as xg 
xgb_regressor = xg.XGBRegressor(objective ='reg:linear', n_estimators = 10, seed = 123) 
  
# Fitting the model 
xgb_regressor.fit(X_train2, y_train2.values.ravel()) 
  
# Predict the model 
y_test_pred = xgb_regressor.predict(X_test2) 
  
# MSE Computation
xgb_MSE = mean_squared_error(y_test2, y_test_pred)
print('The MSE of the XGBoost model is: ', xgb_MSE)

# MAE Computation
xgb_MAE = mean_absolute_error(y_test2, y_test_pred)
print('The MAE of the XGBoost model is: ', xgb_MAE)

# RMSE Computation
from sklearn.metrics import mean_squared_log_error
print('The RMSLE of prediction for XGBoost is:', round(mean_squared_log_error(y_test2, y_test_pred) ** 0.5, 5))
[13:37:44] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.3.0/src/objective/regression_obj.cu:170: reg:linear is now deprecated in favor of reg:squarederror.
The MSE of the XGBoost model is:  3539151989250.334
The MAE of the XGBoost model is:  919899.306456044
The RMSLE of prediction for XGBoost is: 0.59164
In [83]:
# Parameter testing for base learner
# Train and test set are converted to DMatrix objects, as it is required by learning API. 
train_dmatrix = xg.DMatrix(data = X_train2, label = y_train2) 
test_dmatrix = xg.DMatrix(data = X_test2, label = y_test2) 
  
# Parameter dictionary specifying base learner 
param = {"booster":"gblinear", "objective":"reg:linear"} 
  
xgb_r = xg.train(params = param, dtrain = train_dmatrix, num_boost_round = 10) 
pred = xgb_r.predict(test_dmatrix) 

# MSE Computation
xgb_MSE2 = mean_squared_error(y_test2, pred)
print('The MSE of the XGBoost model is: ', xgb_MSE2)

# MAE Computation
xgb_MAE2 = mean_absolute_error(y_test2, pred)
print('The MAE of the XGBoost model is: ', xgb_MAE2)

# RMSE Computation
#print('The RMSLE of prediction is:', round(mean_squared_log_error(y_test2, pred) ** 0.5, 5))


## Storing the new MSEs and MAEs in original list
Model_MSEs.append(xgb_MSE2)
Model_MAEs.append(xgb_MAE2)
[13:37:44] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.3.0/src/objective/regression_obj.cu:170: reg:linear is now deprecated in favor of reg:squarederror.
The MSE of the XGBoost model is:  3155399375930.833
The MAE of the XGBoost model is:  944116.9487122253

We can see an improvement in the MSE with the use of the parameter dictionnary

II - LightGBM

In [84]:
import lightgbm as lgb

hyper_params = {
    'task': 'train',
    'boosting_type': 'gbdt',
    'objective': 'regression',
    'metric': ['l2', 'auc'],
    'learning_rate': 0.005,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.7,
    'bagging_freq': 10,
    'verbose': 0,
    "max_depth": 8,
    "num_leaves": 128,  
    "max_bin": 512,
    "num_iterations": 100000,
    "n_estimators": 1000
}

gbm = lgb.LGBMRegressor(**hyper_params)

gbm.fit(X_train2, y_train2,
        eval_set=[(X_test2, y_test2)],
        eval_metric='l1',
        early_stopping_rounds=1000)
[LightGBM] [Warning] feature_fraction is set=0.9, colsample_bytree=1.0 will be ignored. Current value: feature_fraction=0.9
[LightGBM] [Warning] bagging_fraction is set=0.7, subsample=1.0 will be ignored. Current value: bagging_fraction=0.7
[LightGBM] [Warning] bagging_freq is set=10, subsample_freq=0 will be ignored. Current value: bagging_freq=10
[LightGBM] [Warning] feature_fraction is set=0.9, colsample_bytree=1.0 will be ignored. Current value: feature_fraction=0.9
[LightGBM] [Warning] bagging_fraction is set=0.7, subsample=1.0 will be ignored. Current value: bagging_fraction=0.7
[LightGBM] [Warning] bagging_freq is set=10, subsample_freq=0 will be ignored. Current value: bagging_freq=10
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000666 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Warning] feature_fraction is set=0.9, colsample_bytree=1.0 will be ignored. Current value: feature_fraction=0.9
[LightGBM] [Warning] bagging_fraction is set=0.7, subsample=1.0 will be ignored. Current value: bagging_fraction=0.7
[LightGBM] [Warning] bagging_freq is set=10, subsample_freq=0 will be ignored. Current value: bagging_freq=10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[1]	valid_0's l1: 867844	valid_0's l2: 3.0181e+12	valid_0's auc: 1
Training until validation scores don't improve for 1000 rounds
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[2]	valid_0's l1: 867237	valid_0's l2: 3.01765e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[3]	valid_0's l1: 866843	valid_0's l2: 3.01737e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[4]	valid_0's l1: 866496	valid_0's l2: 3.01756e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[5]	valid_0's l1: 865896	valid_0's l2: 3.01712e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[6]	valid_0's l1: 865298	valid_0's l2: 3.01672e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[7]	valid_0's l1: 864715	valid_0's l2: 3.01646e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[8]	valid_0's l1: 864134	valid_0's l2: 3.01625e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[9]	valid_0's l1: 863655	valid_0's l2: 3.01724e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[10]	valid_0's l1: 863022	valid_0's l2: 3.01796e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[11]	valid_0's l1: 862665	valid_0's l2: 3.01758e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[12]	valid_0's l1: 862460	valid_0's l2: 3.0162e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[13]	valid_0's l1: 861785	valid_0's l2: 3.01495e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[14]	valid_0's l1: 861458	valid_0's l2: 3.01491e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[15]	valid_0's l1: 861173	valid_0's l2: 3.01493e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[16]	valid_0's l1: 860983	valid_0's l2: 3.01515e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[17]	valid_0's l1: 860836	valid_0's l2: 3.01491e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[18]	valid_0's l1: 860611	valid_0's l2: 3.01493e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[19]	valid_0's l1: 859936	valid_0's l2: 3.0138e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[20]	valid_0's l1: 859768	valid_0's l2: 3.01377e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[21]	valid_0's l1: 858914	valid_0's l2: 3.01282e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[22]	valid_0's l1: 858145	valid_0's l2: 3.01168e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[23]	valid_0's l1: 857904	valid_0's l2: 3.01069e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[24]	valid_0's l1: 857020	valid_0's l2: 3.00979e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[25]	valid_0's l1: 856286	valid_0's l2: 3.00951e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[26]	valid_0's l1: 855447	valid_0's l2: 3.00923e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[27]	valid_0's l1: 854680	valid_0's l2: 3.00848e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[28]	valid_0's l1: 853929	valid_0's l2: 3.00798e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[29]	valid_0's l1: 853648	valid_0's l2: 3.00657e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[30]	valid_0's l1: 853453	valid_0's l2: 3.00579e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[31]	valid_0's l1: 852645	valid_0's l2: 3.00515e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[32]	valid_0's l1: 851944	valid_0's l2: 3.00507e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[33]	valid_0's l1: 851247	valid_0's l2: 3.00505e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[34]	valid_0's l1: 851013	valid_0's l2: 3.00558e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[35]	valid_0's l1: 850435	valid_0's l2: 3.00528e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[36]	valid_0's l1: 849767	valid_0's l2: 3.00535e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[37]	valid_0's l1: 849413	valid_0's l2: 3.00451e+12	valid_0's auc: 1
C:\Users\Sophie\anaconda3\lib\site-packages\lightgbm\engine.py:151: UserWarning:

Found `num_iterations` in params. Will use it instead of argument

[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[38]	valid_0's l1: 848757	valid_0's l2: 3.00464e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[39]	valid_0's l1: 848039	valid_0's l2: 3.00516e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[40]	valid_0's l1: 847383	valid_0's l2: 3.00524e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[41]	valid_0's l1: 847422	valid_0's l2: 3.0049e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[42]	valid_0's l1: 847502	valid_0's l2: 3.00461e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[43]	valid_0's l1: 847550	valid_0's l2: 3.00442e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[44]	valid_0's l1: 847573	valid_0's l2: 3.00459e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[45]	valid_0's l1: 847798	valid_0's l2: 3.00458e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[46]	valid_0's l1: 847848	valid_0's l2: 3.00484e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[47]	valid_0's l1: 847953	valid_0's l2: 3.00478e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[48]	valid_0's l1: 848575	valid_0's l2: 3.00344e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[49]	valid_0's l1: 848705	valid_0's l2: 3.00383e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[50]	valid_0's l1: 848915	valid_0's l2: 3.00421e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[51]	valid_0's l1: 848640	valid_0's l2: 3.00429e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[52]	valid_0's l1: 848640	valid_0's l2: 3.00454e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[53]	valid_0's l1: 848433	valid_0's l2: 3.00507e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[54]	valid_0's l1: 848141	valid_0's l2: 3.00533e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[55]	valid_0's l1: 847857	valid_0's l2: 3.00561e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[56]	valid_0's l1: 847445	valid_0's l2: 3.00586e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[57]	valid_0's l1: 847187	valid_0's l2: 3.00618e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[58]	valid_0's l1: 846795	valid_0's l2: 3.00648e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[59]	valid_0's l1: 846573	valid_0's l2: 3.00685e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[60]	valid_0's l1: 846395	valid_0's l2: 3.00734e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[61]	valid_0's l1: 847020	valid_0's l2: 3.00737e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[62]	valid_0's l1: 846683	valid_0's l2: 3.00776e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[63]	valid_0's l1: 847121	valid_0's l2: 3.0078e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[64]	valid_0's l1: 847611	valid_0's l2: 3.00794e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[65]	valid_0's l1: 848141	valid_0's l2: 3.00817e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[66]	valid_0's l1: 848390	valid_0's l2: 3.00834e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[67]	valid_0's l1: 849000	valid_0's l2: 3.0086e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[68]	valid_0's l1: 849028	valid_0's l2: 3.00958e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[69]	valid_0's l1: 849563	valid_0's l2: 3.00982e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[70]	valid_0's l1: 849377	valid_0's l2: 3.00933e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[71]	valid_0's l1: 849374	valid_0's l2: 3.00944e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[72]	valid_0's l1: 849371	valid_0's l2: 3.00957e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[73]	valid_0's l1: 849368	valid_0's l2: 3.00974e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[74]	valid_0's l1: 849365	valid_0's l2: 3.00993e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[75]	valid_0's l1: 848872	valid_0's l2: 3.00932e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[76]	valid_0's l1: 848756	valid_0's l2: 3.00909e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[77]	valid_0's l1: 848760	valid_0's l2: 3.00934e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[78]	valid_0's l1: 848662	valid_0's l2: 3.00916e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[79]	valid_0's l1: 848605	valid_0's l2: 3.00973e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[80]	valid_0's l1: 848646	valid_0's l2: 3.00989e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[81]	valid_0's l1: 848764	valid_0's l2: 3.00921e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[82]	valid_0's l1: 848765	valid_0's l2: 3.00838e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[83]	valid_0's l1: 849129	valid_0's l2: 3.009e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[84]	valid_0's l1: 849049	valid_0's l2: 3.00827e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[85]	valid_0's l1: 848962	valid_0's l2: 3.00762e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[86]	valid_0's l1: 849052	valid_0's l2: 3.00678e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[87]	valid_0's l1: 849344	valid_0's l2: 3.00666e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[88]	valid_0's l1: 849508	valid_0's l2: 3.00607e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[89]	valid_0's l1: 849763	valid_0's l2: 3.00625e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[90]	valid_0's l1: 849918	valid_0's l2: 3.0057e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[91]	valid_0's l1: 849845	valid_0's l2: 3.00536e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[92]	valid_0's l1: 849524	valid_0's l2: 3.00535e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[93]	valid_0's l1: 849432	valid_0's l2: 3.00516e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[94]	valid_0's l1: 849288	valid_0's l2: 3.00496e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[95]	valid_0's l1: 849028	valid_0's l2: 3.00503e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[96]	valid_0's l1: 849116	valid_0's l2: 3.00578e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[97]	valid_0's l1: 849159	valid_0's l2: 3.0057e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[98]	valid_0's l1: 849300	valid_0's l2: 3.00627e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[99]	valid_0's l1: 849383	valid_0's l2: 3.00627e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[100]	valid_0's l1: 849559	valid_0's l2: 3.00634e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[101]	valid_0's l1: 849447	valid_0's l2: 3.00652e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[102]	valid_0's l1: 849650	valid_0's l2: 3.00705e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[103]	valid_0's l1: 849858	valid_0's l2: 3.00763e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[104]	valid_0's l1: 850122	valid_0's l2: 3.00828e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[105]	valid_0's l1: 850065	valid_0's l2: 3.00862e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[106]	valid_0's l1: 850152	valid_0's l2: 3.00871e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[107]	valid_0's l1: 850177	valid_0's l2: 3.00849e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[108]	valid_0's l1: 850581	valid_0's l2: 3.00957e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[109]	valid_0's l1: 850819	valid_0's l2: 3.01041e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[110]	valid_0's l1: 851087	valid_0's l2: 3.01131e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[111]	valid_0's l1: 851420	valid_0's l2: 3.01097e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[112]	valid_0's l1: 852013	valid_0's l2: 3.01083e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[113]	valid_0's l1: 851663	valid_0's l2: 3.01144e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[114]	valid_0's l1: 851880	valid_0's l2: 3.01179e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[115]	valid_0's l1: 851938	valid_0's l2: 3.01093e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[116]	valid_0's l1: 852265	valid_0's l2: 3.01074e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[117]	valid_0's l1: 852599	valid_0's l2: 3.01076e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[118]	valid_0's l1: 852780	valid_0's l2: 3.01037e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[119]	valid_0's l1: 853114	valid_0's l2: 3.0103e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[120]	valid_0's l1: 853455	valid_0's l2: 3.0103e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[121]	valid_0's l1: 854122	valid_0's l2: 3.01091e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[122]	valid_0's l1: 854584	valid_0's l2: 3.01197e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[123]	valid_0's l1: 855156	valid_0's l2: 3.01285e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[124]	valid_0's l1: 855815	valid_0's l2: 3.01363e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[125]	valid_0's l1: 855813	valid_0's l2: 3.01348e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[126]	valid_0's l1: 856402	valid_0's l2: 3.0143e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[127]	valid_0's l1: 856935	valid_0's l2: 3.01541e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[128]	valid_0's l1: 857398	valid_0's l2: 3.01623e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[129]	valid_0's l1: 858013	valid_0's l2: 3.01705e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[130]	valid_0's l1: 858507	valid_0's l2: 3.01762e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[131]	valid_0's l1: 858470	valid_0's l2: 3.0181e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[132]	valid_0's l1: 858330	valid_0's l2: 3.01858e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[133]	valid_0's l1: 858216	valid_0's l2: 3.01904e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[134]	valid_0's l1: 858156	valid_0's l2: 3.02007e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[135]	valid_0's l1: 858071	valid_0's l2: 3.02074e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[136]	valid_0's l1: 857904	valid_0's l2: 3.02193e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[137]	valid_0's l1: 857896	valid_0's l2: 3.02255e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[138]	valid_0's l1: 857812	valid_0's l2: 3.02304e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[139]	valid_0's l1: 857830	valid_0's l2: 3.02375e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[140]	valid_0's l1: 857840	valid_0's l2: 3.02435e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[141]	valid_0's l1: 858473	valid_0's l2: 3.02583e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[142]	valid_0's l1: 858915	valid_0's l2: 3.0273e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[143]	valid_0's l1: 859410	valid_0's l2: 3.02882e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[144]	valid_0's l1: 859719	valid_0's l2: 3.02981e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[145]	valid_0's l1: 860260	valid_0's l2: 3.0314e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[146]	valid_0's l1: 860414	valid_0's l2: 3.03178e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[147]	valid_0's l1: 860951	valid_0's l2: 3.03339e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[148]	valid_0's l1: 861421	valid_0's l2: 3.03397e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[149]	valid_0's l1: 862143	valid_0's l2: 3.03544e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[150]	valid_0's l1: 862363	valid_0's l2: 3.03587e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[151]	valid_0's l1: 862205	valid_0's l2: 3.0358e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[152]	valid_0's l1: 862093	valid_0's l2: 3.03567e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[153]	valid_0's l1: 862003	valid_0's l2: 3.03557e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[154]	valid_0's l1: 861929	valid_0's l2: 3.03551e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[155]	valid_0's l1: 861856	valid_0's l2: 3.03547e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[156]	valid_0's l1: 861783	valid_0's l2: 3.03545e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[157]	valid_0's l1: 861733	valid_0's l2: 3.03541e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[158]	valid_0's l1: 861661	valid_0's l2: 3.03543e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[159]	valid_0's l1: 861618	valid_0's l2: 3.0354e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[160]	valid_0's l1: 861709	valid_0's l2: 3.03601e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[161]	valid_0's l1: 861676	valid_0's l2: 3.0367e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[162]	valid_0's l1: 861657	valid_0's l2: 3.03741e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[163]	valid_0's l1: 861488	valid_0's l2: 3.03811e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[164]	valid_0's l1: 861526	valid_0's l2: 3.03871e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[165]	valid_0's l1: 861529	valid_0's l2: 3.04001e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[166]	valid_0's l1: 861526	valid_0's l2: 3.04088e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[167]	valid_0's l1: 861557	valid_0's l2: 3.04126e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[168]	valid_0's l1: 861603	valid_0's l2: 3.04188e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[169]	valid_0's l1: 861438	valid_0's l2: 3.04262e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[170]	valid_0's l1: 861625	valid_0's l2: 3.04317e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[171]	valid_0's l1: 861594	valid_0's l2: 3.04353e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[172]	valid_0's l1: 861420	valid_0's l2: 3.04446e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[173]	valid_0's l1: 861359	valid_0's l2: 3.04496e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[174]	valid_0's l1: 861352	valid_0's l2: 3.04532e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[175]	valid_0's l1: 861394	valid_0's l2: 3.04578e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[176]	valid_0's l1: 861447	valid_0's l2: 3.04626e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[177]	valid_0's l1: 861612	valid_0's l2: 3.04644e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[178]	valid_0's l1: 861837	valid_0's l2: 3.04744e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[179]	valid_0's l1: 862027	valid_0's l2: 3.04726e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[180]	valid_0's l1: 861886	valid_0's l2: 3.04762e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[181]	valid_0's l1: 862197	valid_0's l2: 3.04812e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[182]	valid_0's l1: 862595	valid_0's l2: 3.04909e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[183]	valid_0's l1: 862937	valid_0's l2: 3.05008e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[184]	valid_0's l1: 863023	valid_0's l2: 3.05079e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[185]	valid_0's l1: 863557	valid_0's l2: 3.05199e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[186]	valid_0's l1: 863868	valid_0's l2: 3.05344e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[187]	valid_0's l1: 864192	valid_0's l2: 3.05428e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[188]	valid_0's l1: 864469	valid_0's l2: 3.05579e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[189]	valid_0's l1: 864915	valid_0's l2: 3.05693e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[190]	valid_0's l1: 865294	valid_0's l2: 3.05765e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[191]	valid_0's l1: 865590	valid_0's l2: 3.05844e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[192]	valid_0's l1: 865818	valid_0's l2: 3.05925e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[193]	valid_0's l1: 866082	valid_0's l2: 3.06006e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[194]	valid_0's l1: 866325	valid_0's l2: 3.06061e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[195]	valid_0's l1: 866544	valid_0's l2: 3.06173e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[196]	valid_0's l1: 866800	valid_0's l2: 3.06266e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[197]	valid_0's l1: 867095	valid_0's l2: 3.06327e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[198]	valid_0's l1: 867349	valid_0's l2: 3.06424e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[199]	valid_0's l1: 867572	valid_0's l2: 3.06536e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[200]	valid_0's l1: 867808	valid_0's l2: 3.0663e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[201]	valid_0's l1: 868063	valid_0's l2: 3.06617e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[202]	valid_0's l1: 868230	valid_0's l2: 3.06597e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[203]	valid_0's l1: 868484	valid_0's l2: 3.06588e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[204]	valid_0's l1: 868419	valid_0's l2: 3.06662e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[205]	valid_0's l1: 868586	valid_0's l2: 3.06646e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[206]	valid_0's l1: 868378	valid_0's l2: 3.0671e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[207]	valid_0's l1: 867997	valid_0's l2: 3.06805e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[208]	valid_0's l1: 868195	valid_0's l2: 3.06788e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[209]	valid_0's l1: 868181	valid_0's l2: 3.06868e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[210]	valid_0's l1: 868183	valid_0's l2: 3.06964e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[211]	valid_0's l1: 868640	valid_0's l2: 3.07041e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[212]	valid_0's l1: 869036	valid_0's l2: 3.07091e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[213]	valid_0's l1: 869342	valid_0's l2: 3.07181e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[214]	valid_0's l1: 869743	valid_0's l2: 3.07251e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[215]	valid_0's l1: 870056	valid_0's l2: 3.07346e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[216]	valid_0's l1: 870590	valid_0's l2: 3.07459e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[217]	valid_0's l1: 871230	valid_0's l2: 3.075e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[218]	valid_0's l1: 871823	valid_0's l2: 3.07598e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[219]	valid_0's l1: 872440	valid_0's l2: 3.07686e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[220]	valid_0's l1: 873164	valid_0's l2: 3.07732e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[221]	valid_0's l1: 873575	valid_0's l2: 3.07865e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[222]	valid_0's l1: 874074	valid_0's l2: 3.08001e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[223]	valid_0's l1: 874564	valid_0's l2: 3.08102e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[224]	valid_0's l1: 874917	valid_0's l2: 3.08199e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[225]	valid_0's l1: 875267	valid_0's l2: 3.08298e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[226]	valid_0's l1: 875757	valid_0's l2: 3.08442e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[227]	valid_0's l1: 876262	valid_0's l2: 3.08587e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[228]	valid_0's l1: 876838	valid_0's l2: 3.08754e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[229]	valid_0's l1: 876583	valid_0's l2: 3.08869e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[230]	valid_0's l1: 877130	valid_0's l2: 3.09017e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[231]	valid_0's l1: 877504	valid_0's l2: 3.09094e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[232]	valid_0's l1: 877677	valid_0's l2: 3.09156e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[233]	valid_0's l1: 877890	valid_0's l2: 3.09278e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[234]	valid_0's l1: 878161	valid_0's l2: 3.09409e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[235]	valid_0's l1: 878401	valid_0's l2: 3.09476e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[236]	valid_0's l1: 879010	valid_0's l2: 3.09508e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[237]	valid_0's l1: 879337	valid_0's l2: 3.09579e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[238]	valid_0's l1: 879642	valid_0's l2: 3.09681e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[239]	valid_0's l1: 879709	valid_0's l2: 3.09775e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[240]	valid_0's l1: 879986	valid_0's l2: 3.09899e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[241]	valid_0's l1: 880043	valid_0's l2: 3.09989e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[242]	valid_0's l1: 880127	valid_0's l2: 3.10089e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[243]	valid_0's l1: 880183	valid_0's l2: 3.10183e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[244]	valid_0's l1: 880262	valid_0's l2: 3.10246e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[245]	valid_0's l1: 880136	valid_0's l2: 3.10268e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[246]	valid_0's l1: 879878	valid_0's l2: 3.10371e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[247]	valid_0's l1: 879973	valid_0's l2: 3.10475e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[248]	valid_0's l1: 880190	valid_0's l2: 3.10514e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[249]	valid_0's l1: 879884	valid_0's l2: 3.10582e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[250]	valid_0's l1: 880020	valid_0's l2: 3.10615e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[251]	valid_0's l1: 879974	valid_0's l2: 3.10578e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[252]	valid_0's l1: 879954	valid_0's l2: 3.10622e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[253]	valid_0's l1: 880173	valid_0's l2: 3.10661e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[254]	valid_0's l1: 880286	valid_0's l2: 3.10699e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[255]	valid_0's l1: 880472	valid_0's l2: 3.10751e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[256]	valid_0's l1: 880658	valid_0's l2: 3.10806e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[257]	valid_0's l1: 880586	valid_0's l2: 3.10833e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[258]	valid_0's l1: 880817	valid_0's l2: 3.10908e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[259]	valid_0's l1: 880976	valid_0's l2: 3.10972e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[260]	valid_0's l1: 881171	valid_0's l2: 3.11047e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[261]	valid_0's l1: 880928	valid_0's l2: 3.11025e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[262]	valid_0's l1: 880859	valid_0's l2: 3.11088e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[263]	valid_0's l1: 880736	valid_0's l2: 3.11079e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[264]	valid_0's l1: 880680	valid_0's l2: 3.11062e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[265]	valid_0's l1: 880474	valid_0's l2: 3.10969e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[266]	valid_0's l1: 880589	valid_0's l2: 3.10965e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[267]	valid_0's l1: 880386	valid_0's l2: 3.1092e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[268]	valid_0's l1: 880410	valid_0's l2: 3.1102e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[269]	valid_0's l1: 880507	valid_0's l2: 3.11056e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[270]	valid_0's l1: 880456	valid_0's l2: 3.1112e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[271]	valid_0's l1: 880623	valid_0's l2: 3.11142e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[272]	valid_0's l1: 881067	valid_0's l2: 3.11175e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[273]	valid_0's l1: 881196	valid_0's l2: 3.11241e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[274]	valid_0's l1: 881554	valid_0's l2: 3.11403e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[275]	valid_0's l1: 881916	valid_0's l2: 3.11413e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[276]	valid_0's l1: 882283	valid_0's l2: 3.11434e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[277]	valid_0's l1: 882975	valid_0's l2: 3.1164e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[278]	valid_0's l1: 883188	valid_0's l2: 3.11706e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[279]	valid_0's l1: 883980	valid_0's l2: 3.11816e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[280]	valid_0's l1: 884120	valid_0's l2: 3.11886e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[281]	valid_0's l1: 884420	valid_0's l2: 3.11936e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[282]	valid_0's l1: 884644	valid_0's l2: 3.12007e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[283]	valid_0's l1: 884852	valid_0's l2: 3.12024e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[284]	valid_0's l1: 885058	valid_0's l2: 3.12043e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[285]	valid_0's l1: 885282	valid_0's l2: 3.12116e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[286]	valid_0's l1: 885233	valid_0's l2: 3.12154e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[287]	valid_0's l1: 885483	valid_0's l2: 3.12212e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[288]	valid_0's l1: 885685	valid_0's l2: 3.12234e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[289]	valid_0's l1: 885639	valid_0's l2: 3.12275e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[290]	valid_0's l1: 885850	valid_0's l2: 3.1234e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[291]	valid_0's l1: 886132	valid_0's l2: 3.12497e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[292]	valid_0's l1: 886060	valid_0's l2: 3.12519e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[293]	valid_0's l1: 886520	valid_0's l2: 3.12707e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[294]	valid_0's l1: 886807	valid_0's l2: 3.12888e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[295]	valid_0's l1: 887135	valid_0's l2: 3.1306e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[296]	valid_0's l1: 887214	valid_0's l2: 3.13201e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[297]	valid_0's l1: 887332	valid_0's l2: 3.13343e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[298]	valid_0's l1: 887858	valid_0's l2: 3.13549e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[299]	valid_0's l1: 888190	valid_0's l2: 3.13735e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[300]	valid_0's l1: 888734	valid_0's l2: 3.13951e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[301]	valid_0's l1: 889005	valid_0's l2: 3.14047e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[302]	valid_0's l1: 889554	valid_0's l2: 3.14201e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[303]	valid_0's l1: 889699	valid_0's l2: 3.14303e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[304]	valid_0's l1: 889744	valid_0's l2: 3.14395e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[305]	valid_0's l1: 890163	valid_0's l2: 3.14524e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[306]	valid_0's l1: 890287	valid_0's l2: 3.1451e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[307]	valid_0's l1: 890729	valid_0's l2: 3.14666e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[308]	valid_0's l1: 891131	valid_0's l2: 3.14838e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[309]	valid_0's l1: 891533	valid_0's l2: 3.1493e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[310]	valid_0's l1: 891910	valid_0's l2: 3.15039e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[311]	valid_0's l1: 892103	valid_0's l2: 3.14992e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[312]	valid_0's l1: 892097	valid_0's l2: 3.15077e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[313]	valid_0's l1: 892056	valid_0's l2: 3.15108e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[314]	valid_0's l1: 892271	valid_0's l2: 3.15071e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[315]	valid_0's l1: 892364	valid_0's l2: 3.15174e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[316]	valid_0's l1: 892549	valid_0's l2: 3.15189e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[317]	valid_0's l1: 892881	valid_0's l2: 3.15198e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[318]	valid_0's l1: 893269	valid_0's l2: 3.15213e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[319]	valid_0's l1: 893312	valid_0's l2: 3.153e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[320]	valid_0's l1: 893668	valid_0's l2: 3.15317e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[321]	valid_0's l1: 893768	valid_0's l2: 3.15363e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[322]	valid_0's l1: 893666	valid_0's l2: 3.15427e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[323]	valid_0's l1: 893560	valid_0's l2: 3.15489e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[324]	valid_0's l1: 893492	valid_0's l2: 3.15534e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[325]	valid_0's l1: 893392	valid_0's l2: 3.156e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[326]	valid_0's l1: 893410	valid_0's l2: 3.15601e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[327]	valid_0's l1: 893271	valid_0's l2: 3.15634e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[328]	valid_0's l1: 893175	valid_0's l2: 3.15701e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[329]	valid_0's l1: 893324	valid_0's l2: 3.15783e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[330]	valid_0's l1: 893458	valid_0's l2: 3.15815e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[331]	valid_0's l1: 893834	valid_0's l2: 3.15903e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[332]	valid_0's l1: 894202	valid_0's l2: 3.15969e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[333]	valid_0's l1: 894506	valid_0's l2: 3.16015e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[334]	valid_0's l1: 894826	valid_0's l2: 3.16064e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[335]	valid_0's l1: 895067	valid_0's l2: 3.16098e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[336]	valid_0's l1: 895309	valid_0's l2: 3.16134e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[337]	valid_0's l1: 895645	valid_0's l2: 3.16225e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[338]	valid_0's l1: 895870	valid_0's l2: 3.16261e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[339]	valid_0's l1: 896117	valid_0's l2: 3.16334e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[340]	valid_0's l1: 896472	valid_0's l2: 3.16405e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[341]	valid_0's l1: 896672	valid_0's l2: 3.1643e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[342]	valid_0's l1: 896948	valid_0's l2: 3.16485e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[343]	valid_0's l1: 897146	valid_0's l2: 3.16552e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[344]	valid_0's l1: 897409	valid_0's l2: 3.16596e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[345]	valid_0's l1: 897855	valid_0's l2: 3.16714e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[346]	valid_0's l1: 898046	valid_0's l2: 3.16749e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[347]	valid_0's l1: 898356	valid_0's l2: 3.16829e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[348]	valid_0's l1: 898476	valid_0's l2: 3.16815e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[349]	valid_0's l1: 898970	valid_0's l2: 3.16909e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[350]	valid_0's l1: 899173	valid_0's l2: 3.16972e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[351]	valid_0's l1: 899518	valid_0's l2: 3.17065e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[352]	valid_0's l1: 899934	valid_0's l2: 3.17138e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[353]	valid_0's l1: 900349	valid_0's l2: 3.17213e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[354]	valid_0's l1: 900696	valid_0's l2: 3.17307e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[355]	valid_0's l1: 901004	valid_0's l2: 3.17406e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[356]	valid_0's l1: 901405	valid_0's l2: 3.17482e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[357]	valid_0's l1: 901482	valid_0's l2: 3.17533e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[358]	valid_0's l1: 901880	valid_0's l2: 3.17611e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[359]	valid_0's l1: 902276	valid_0's l2: 3.17691e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[360]	valid_0's l1: 902630	valid_0's l2: 3.17716e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[361]	valid_0's l1: 902861	valid_0's l2: 3.177e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[362]	valid_0's l1: 902964	valid_0's l2: 3.17676e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[363]	valid_0's l1: 903125	valid_0's l2: 3.17667e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[364]	valid_0's l1: 903385	valid_0's l2: 3.17719e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[365]	valid_0's l1: 903640	valid_0's l2: 3.1774e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[366]	valid_0's l1: 903901	valid_0's l2: 3.17795e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[367]	valid_0's l1: 904028	valid_0's l2: 3.17781e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[368]	valid_0's l1: 904336	valid_0's l2: 3.17864e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[369]	valid_0's l1: 904483	valid_0's l2: 3.17853e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[370]	valid_0's l1: 904898	valid_0's l2: 3.17951e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[371]	valid_0's l1: 904577	valid_0's l2: 3.1796e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[372]	valid_0's l1: 904039	valid_0's l2: 3.17913e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[373]	valid_0's l1: 903761	valid_0's l2: 3.17926e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[374]	valid_0's l1: 903227	valid_0's l2: 3.1788e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[375]	valid_0's l1: 902702	valid_0's l2: 3.17837e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[376]	valid_0's l1: 902227	valid_0's l2: 3.17829e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[377]	valid_0's l1: 901706	valid_0's l2: 3.17788e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[378]	valid_0's l1: 901237	valid_0's l2: 3.17781e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[379]	valid_0's l1: 900716	valid_0's l2: 3.17741e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[380]	valid_0's l1: 900310	valid_0's l2: 3.17744e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[381]	valid_0's l1: 900341	valid_0's l2: 3.1776e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[382]	valid_0's l1: 900387	valid_0's l2: 3.17808e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[383]	valid_0's l1: 900363	valid_0's l2: 3.17836e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[384]	valid_0's l1: 900535	valid_0's l2: 3.1791e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[385]	valid_0's l1: 900585	valid_0's l2: 3.17944e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[386]	valid_0's l1: 900798	valid_0's l2: 3.18013e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[387]	valid_0's l1: 900321	valid_0's l2: 3.17955e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[388]	valid_0's l1: 900538	valid_0's l2: 3.18019e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[389]	valid_0's l1: 900291	valid_0's l2: 3.18034e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[390]	valid_0's l1: 900543	valid_0's l2: 3.18116e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[391]	valid_0's l1: 900680	valid_0's l2: 3.18187e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[392]	valid_0's l1: 901243	valid_0's l2: 3.18315e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[393]	valid_0's l1: 901719	valid_0's l2: 3.18408e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[394]	valid_0's l1: 902106	valid_0's l2: 3.18517e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[395]	valid_0's l1: 902564	valid_0's l2: 3.18608e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[396]	valid_0's l1: 902806	valid_0's l2: 3.18739e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[397]	valid_0's l1: 903206	valid_0's l2: 3.18787e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[398]	valid_0's l1: 903652	valid_0's l2: 3.18921e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[399]	valid_0's l1: 904127	valid_0's l2: 3.19019e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[400]	valid_0's l1: 904493	valid_0's l2: 3.19067e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[401]	valid_0's l1: 904811	valid_0's l2: 3.19206e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[402]	valid_0's l1: 905365	valid_0's l2: 3.19355e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[403]	valid_0's l1: 905934	valid_0's l2: 3.19506e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[404]	valid_0's l1: 906263	valid_0's l2: 3.19572e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[405]	valid_0's l1: 906382	valid_0's l2: 3.1952e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[406]	valid_0's l1: 906748	valid_0's l2: 3.1967e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[407]	valid_0's l1: 907085	valid_0's l2: 3.19713e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[408]	valid_0's l1: 907449	valid_0's l2: 3.19769e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[409]	valid_0's l1: 908025	valid_0's l2: 3.19925e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[410]	valid_0's l1: 908391	valid_0's l2: 3.20009e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[411]	valid_0's l1: 908761	valid_0's l2: 3.2009e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[412]	valid_0's l1: 908990	valid_0's l2: 3.20165e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[413]	valid_0's l1: 909225	valid_0's l2: 3.20242e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[414]	valid_0's l1: 909542	valid_0's l2: 3.20365e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[415]	valid_0's l1: 909996	valid_0's l2: 3.20511e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[416]	valid_0's l1: 910312	valid_0's l2: 3.20626e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[417]	valid_0's l1: 910751	valid_0's l2: 3.20763e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[418]	valid_0's l1: 911079	valid_0's l2: 3.20894e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[419]	valid_0's l1: 911530	valid_0's l2: 3.21048e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[420]	valid_0's l1: 911874	valid_0's l2: 3.21121e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[421]	valid_0's l1: 911866	valid_0's l2: 3.21194e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[422]	valid_0's l1: 911861	valid_0's l2: 3.21268e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[423]	valid_0's l1: 911817	valid_0's l2: 3.21333e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[424]	valid_0's l1: 912194	valid_0's l2: 3.21487e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[425]	valid_0's l1: 912267	valid_0's l2: 3.21574e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[426]	valid_0's l1: 912255	valid_0's l2: 3.21646e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[427]	valid_0's l1: 912289	valid_0's l2: 3.2172e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[428]	valid_0's l1: 912678	valid_0's l2: 3.21877e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[429]	valid_0's l1: 912501	valid_0's l2: 3.2192e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[430]	valid_0's l1: 912738	valid_0's l2: 3.22009e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[431]	valid_0's l1: 913055	valid_0's l2: 3.22077e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[432]	valid_0's l1: 913206	valid_0's l2: 3.22021e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[433]	valid_0's l1: 913380	valid_0's l2: 3.22035e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[434]	valid_0's l1: 913732	valid_0's l2: 3.22136e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[435]	valid_0's l1: 913928	valid_0's l2: 3.22158e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[436]	valid_0's l1: 914176	valid_0's l2: 3.22197e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[437]	valid_0's l1: 914419	valid_0's l2: 3.22231e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[438]	valid_0's l1: 914649	valid_0's l2: 3.22262e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[439]	valid_0's l1: 914988	valid_0's l2: 3.22314e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[440]	valid_0's l1: 915256	valid_0's l2: 3.22358e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[441]	valid_0's l1: 915030	valid_0's l2: 3.22318e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[442]	valid_0's l1: 914805	valid_0's l2: 3.22279e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[443]	valid_0's l1: 914535	valid_0's l2: 3.22223e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[444]	valid_0's l1: 914318	valid_0's l2: 3.22188e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[445]	valid_0's l1: 914234	valid_0's l2: 3.22191e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[446]	valid_0's l1: 914026	valid_0's l2: 3.22162e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[447]	valid_0's l1: 913826	valid_0's l2: 3.22133e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[448]	valid_0's l1: 913794	valid_0's l2: 3.22062e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[449]	valid_0's l1: 913790	valid_0's l2: 3.22044e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[450]	valid_0's l1: 913828	valid_0's l2: 3.21975e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[451]	valid_0's l1: 913544	valid_0's l2: 3.21936e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[452]	valid_0's l1: 913253	valid_0's l2: 3.21928e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[453]	valid_0's l1: 912971	valid_0's l2: 3.21891e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[454]	valid_0's l1: 912611	valid_0's l2: 3.21908e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[455]	valid_0's l1: 912452	valid_0's l2: 3.21901e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[456]	valid_0's l1: 912476	valid_0's l2: 3.21983e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[457]	valid_0's l1: 912202	valid_0's l2: 3.21986e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[458]	valid_0's l1: 911924	valid_0's l2: 3.21953e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[459]	valid_0's l1: 911626	valid_0's l2: 3.21949e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[460]	valid_0's l1: 911569	valid_0's l2: 3.21947e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[461]	valid_0's l1: 912449	valid_0's l2: 3.22116e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[462]	valid_0's l1: 912571	valid_0's l2: 3.22184e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[463]	valid_0's l1: 913192	valid_0's l2: 3.22294e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[464]	valid_0's l1: 913809	valid_0's l2: 3.22442e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[465]	valid_0's l1: 914479	valid_0's l2: 3.22581e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[466]	valid_0's l1: 915005	valid_0's l2: 3.22694e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[467]	valid_0's l1: 915629	valid_0's l2: 3.22809e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[468]	valid_0's l1: 916284	valid_0's l2: 3.22957e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[469]	valid_0's l1: 916892	valid_0's l2: 3.23103e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[470]	valid_0's l1: 917513	valid_0's l2: 3.23254e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[471]	valid_0's l1: 917439	valid_0's l2: 3.23266e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[472]	valid_0's l1: 917767	valid_0's l2: 3.23291e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[473]	valid_0's l1: 918156	valid_0's l2: 3.23379e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[474]	valid_0's l1: 918244	valid_0's l2: 3.23379e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[475]	valid_0's l1: 918380	valid_0's l2: 3.23396e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[476]	valid_0's l1: 918442	valid_0's l2: 3.2341e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[477]	valid_0's l1: 918842	valid_0's l2: 3.23443e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[478]	valid_0's l1: 918887	valid_0's l2: 3.23464e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[479]	valid_0's l1: 919275	valid_0's l2: 3.23551e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[480]	valid_0's l1: 919378	valid_0's l2: 3.23592e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[481]	valid_0's l1: 919630	valid_0's l2: 3.23607e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[482]	valid_0's l1: 919900	valid_0's l2: 3.23636e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[483]	valid_0's l1: 920013	valid_0's l2: 3.23702e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[484]	valid_0's l1: 920261	valid_0's l2: 3.23723e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[485]	valid_0's l1: 920510	valid_0's l2: 3.23746e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[486]	valid_0's l1: 920762	valid_0's l2: 3.23773e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[487]	valid_0's l1: 920936	valid_0's l2: 3.23828e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[488]	valid_0's l1: 920901	valid_0's l2: 3.23756e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[489]	valid_0's l1: 921147	valid_0's l2: 3.23786e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[490]	valid_0's l1: 921398	valid_0's l2: 3.23818e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[491]	valid_0's l1: 921570	valid_0's l2: 3.23881e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[492]	valid_0's l1: 921912	valid_0's l2: 3.23966e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[493]	valid_0's l1: 922172	valid_0's l2: 3.24079e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[494]	valid_0's l1: 922631	valid_0's l2: 3.24191e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[495]	valid_0's l1: 922544	valid_0's l2: 3.24265e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[496]	valid_0's l1: 922584	valid_0's l2: 3.24306e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[497]	valid_0's l1: 922992	valid_0's l2: 3.24384e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[498]	valid_0's l1: 923476	valid_0's l2: 3.24505e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[499]	valid_0's l1: 923746	valid_0's l2: 3.24615e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[500]	valid_0's l1: 923976	valid_0's l2: 3.24736e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[501]	valid_0's l1: 924286	valid_0's l2: 3.24811e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[502]	valid_0's l1: 924602	valid_0's l2: 3.24889e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[503]	valid_0's l1: 924807	valid_0's l2: 3.24978e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[504]	valid_0's l1: 925031	valid_0's l2: 3.2504e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[505]	valid_0's l1: 925208	valid_0's l2: 3.25058e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[506]	valid_0's l1: 925445	valid_0's l2: 3.25123e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[507]	valid_0's l1: 925741	valid_0's l2: 3.252e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[508]	valid_0's l1: 925975	valid_0's l2: 3.25269e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[509]	valid_0's l1: 926209	valid_0's l2: 3.2534e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[510]	valid_0's l1: 926288	valid_0's l2: 3.25411e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[511]	valid_0's l1: 926735	valid_0's l2: 3.25489e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[512]	valid_0's l1: 926909	valid_0's l2: 3.25588e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[513]	valid_0's l1: 927334	valid_0's l2: 3.25644e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[514]	valid_0's l1: 927766	valid_0's l2: 3.25704e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[515]	valid_0's l1: 927842	valid_0's l2: 3.25793e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[516]	valid_0's l1: 928101	valid_0's l2: 3.25807e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[517]	valid_0's l1: 928556	valid_0's l2: 3.25876e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[518]	valid_0's l1: 928924	valid_0's l2: 3.25889e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[519]	valid_0's l1: 929412	valid_0's l2: 3.25947e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[520]	valid_0's l1: 929869	valid_0's l2: 3.2601e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[521]	valid_0's l1: 930034	valid_0's l2: 3.26135e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[522]	valid_0's l1: 930199	valid_0's l2: 3.26261e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[523]	valid_0's l1: 930303	valid_0's l2: 3.26313e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[524]	valid_0's l1: 930492	valid_0's l2: 3.26424e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[525]	valid_0's l1: 930619	valid_0's l2: 3.26479e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[526]	valid_0's l1: 930667	valid_0's l2: 3.2655e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[527]	valid_0's l1: 930857	valid_0's l2: 3.2666e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[528]	valid_0's l1: 930931	valid_0's l2: 3.26759e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[529]	valid_0's l1: 931062	valid_0's l2: 3.26818e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[530]	valid_0's l1: 931270	valid_0's l2: 3.26936e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[531]	valid_0's l1: 930914	valid_0's l2: 3.26912e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[532]	valid_0's l1: 930517	valid_0's l2: 3.26878e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[533]	valid_0's l1: 930160	valid_0's l2: 3.26847e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[534]	valid_0's l1: 930343	valid_0's l2: 3.26897e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[535]	valid_0's l1: 930027	valid_0's l2: 3.26875e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[536]	valid_0's l1: 929737	valid_0's l2: 3.26855e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[537]	valid_0's l1: 929422	valid_0's l2: 3.2683e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[538]	valid_0's l1: 929113	valid_0's l2: 3.26815e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[539]	valid_0's l1: 928969	valid_0's l2: 3.26801e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[540]	valid_0's l1: 928667	valid_0's l2: 3.2679e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[541]	valid_0's l1: 928880	valid_0's l2: 3.26769e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[542]	valid_0's l1: 929097	valid_0's l2: 3.2672e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[543]	valid_0's l1: 929303	valid_0's l2: 3.26757e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[544]	valid_0's l1: 929494	valid_0's l2: 3.26764e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[545]	valid_0's l1: 929711	valid_0's l2: 3.26825e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[546]	valid_0's l1: 930028	valid_0's l2: 3.26823e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[547]	valid_0's l1: 930208	valid_0's l2: 3.26853e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[548]	valid_0's l1: 930415	valid_0's l2: 3.26837e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[549]	valid_0's l1: 930530	valid_0's l2: 3.26875e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[550]	valid_0's l1: 930705	valid_0's l2: 3.26876e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[551]	valid_0's l1: 930688	valid_0's l2: 3.26746e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[552]	valid_0's l1: 930687	valid_0's l2: 3.26616e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[553]	valid_0's l1: 930876	valid_0's l2: 3.2656e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[554]	valid_0's l1: 931094	valid_0's l2: 3.26505e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[555]	valid_0's l1: 931309	valid_0's l2: 3.26453e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[556]	valid_0's l1: 931397	valid_0's l2: 3.26443e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[557]	valid_0's l1: 931309	valid_0's l2: 3.26366e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[558]	valid_0's l1: 931209	valid_0's l2: 3.26291e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[559]	valid_0's l1: 931424	valid_0's l2: 3.26244e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[560]	valid_0's l1: 931641	valid_0's l2: 3.26199e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[561]	valid_0's l1: 932046	valid_0's l2: 3.26337e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[562]	valid_0's l1: 932632	valid_0's l2: 3.26433e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[563]	valid_0's l1: 933182	valid_0's l2: 3.26581e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[564]	valid_0's l1: 933661	valid_0's l2: 3.26687e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[565]	valid_0's l1: 934187	valid_0's l2: 3.26842e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[566]	valid_0's l1: 934693	valid_0's l2: 3.26996e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[567]	valid_0's l1: 935181	valid_0's l2: 3.27132e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[568]	valid_0's l1: 935647	valid_0's l2: 3.27292e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[569]	valid_0's l1: 936224	valid_0's l2: 3.2746e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[570]	valid_0's l1: 936687	valid_0's l2: 3.27605e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[571]	valid_0's l1: 936967	valid_0's l2: 3.27614e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[572]	valid_0's l1: 937461	valid_0's l2: 3.27762e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[573]	valid_0's l1: 937661	valid_0's l2: 3.27792e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[574]	valid_0's l1: 938098	valid_0's l2: 3.27818e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[575]	valid_0's l1: 938320	valid_0's l2: 3.27802e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[576]	valid_0's l1: 938550	valid_0's l2: 3.2791e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[577]	valid_0's l1: 938773	valid_0's l2: 3.27895e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[578]	valid_0's l1: 938994	valid_0's l2: 3.27881e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[579]	valid_0's l1: 939026	valid_0's l2: 3.27909e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[580]	valid_0's l1: 939250	valid_0's l2: 3.2797e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[581]	valid_0's l1: 939790	valid_0's l2: 3.28055e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[582]	valid_0's l1: 940315	valid_0's l2: 3.28163e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[583]	valid_0's l1: 940983	valid_0's l2: 3.28338e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[584]	valid_0's l1: 941594	valid_0's l2: 3.28473e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[585]	valid_0's l1: 942311	valid_0's l2: 3.28609e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[586]	valid_0's l1: 943093	valid_0's l2: 3.28775e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[587]	valid_0's l1: 943711	valid_0's l2: 3.28929e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[588]	valid_0's l1: 944343	valid_0's l2: 3.29067e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[589]	valid_0's l1: 944546	valid_0's l2: 3.29088e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[590]	valid_0's l1: 945322	valid_0's l2: 3.29228e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[591]	valid_0's l1: 945225	valid_0's l2: 3.29294e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[592]	valid_0's l1: 945464	valid_0's l2: 3.29311e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[593]	valid_0's l1: 945280	valid_0's l2: 3.29288e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[594]	valid_0's l1: 945497	valid_0's l2: 3.293e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[595]	valid_0's l1: 945342	valid_0's l2: 3.29322e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[596]	valid_0's l1: 945182	valid_0's l2: 3.29304e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[597]	valid_0's l1: 945008	valid_0's l2: 3.29284e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[598]	valid_0's l1: 945126	valid_0's l2: 3.2935e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[599]	valid_0's l1: 945142	valid_0's l2: 3.29397e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[600]	valid_0's l1: 945330	valid_0's l2: 3.29436e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[601]	valid_0's l1: 945805	valid_0's l2: 3.29577e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[602]	valid_0's l1: 945915	valid_0's l2: 3.29667e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[603]	valid_0's l1: 945818	valid_0's l2: 3.29709e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[604]	valid_0's l1: 946321	valid_0's l2: 3.29853e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[605]	valid_0's l1: 947006	valid_0's l2: 3.29983e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[606]	valid_0's l1: 947478	valid_0's l2: 3.30093e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[607]	valid_0's l1: 947988	valid_0's l2: 3.30245e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[608]	valid_0's l1: 948469	valid_0's l2: 3.30361e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[609]	valid_0's l1: 949335	valid_0's l2: 3.3053e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[610]	valid_0's l1: 949324	valid_0's l2: 3.30539e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[611]	valid_0's l1: 949681	valid_0's l2: 3.30644e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[612]	valid_0's l1: 950028	valid_0's l2: 3.30755e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[613]	valid_0's l1: 950471	valid_0's l2: 3.30873e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[614]	valid_0's l1: 950906	valid_0's l2: 3.30998e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[615]	valid_0's l1: 951259	valid_0's l2: 3.31094e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[616]	valid_0's l1: 951634	valid_0's l2: 3.31194e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[617]	valid_0's l1: 952036	valid_0's l2: 3.31313e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[618]	valid_0's l1: 952347	valid_0's l2: 3.31432e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[619]	valid_0's l1: 952745	valid_0's l2: 3.31549e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[620]	valid_0's l1: 953036	valid_0's l2: 3.31668e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[621]	valid_0's l1: 952814	valid_0's l2: 3.31666e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[622]	valid_0's l1: 952480	valid_0's l2: 3.31698e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[623]	valid_0's l1: 952266	valid_0's l2: 3.31698e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[624]	valid_0's l1: 952167	valid_0's l2: 3.31776e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[625]	valid_0's l1: 951785	valid_0's l2: 3.31776e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[626]	valid_0's l1: 951561	valid_0's l2: 3.31779e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[627]	valid_0's l1: 951393	valid_0's l2: 3.31804e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[628]	valid_0's l1: 951125	valid_0's l2: 3.31804e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[629]	valid_0's l1: 950914	valid_0's l2: 3.3181e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[630]	valid_0's l1: 950691	valid_0's l2: 3.31819e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[631]	valid_0's l1: 951021	valid_0's l2: 3.31894e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[632]	valid_0's l1: 951361	valid_0's l2: 3.31955e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[633]	valid_0's l1: 951670	valid_0's l2: 3.32019e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[634]	valid_0's l1: 952071	valid_0's l2: 3.32169e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[635]	valid_0's l1: 952414	valid_0's l2: 3.32235e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[636]	valid_0's l1: 952825	valid_0's l2: 3.32387e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[637]	valid_0's l1: 953289	valid_0's l2: 3.32535e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[638]	valid_0's l1: 953611	valid_0's l2: 3.32639e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[639]	valid_0's l1: 954045	valid_0's l2: 3.32817e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[640]	valid_0's l1: 954243	valid_0's l2: 3.32842e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[641]	valid_0's l1: 954391	valid_0's l2: 3.32906e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[642]	valid_0's l1: 954533	valid_0's l2: 3.32972e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[643]	valid_0's l1: 954610	valid_0's l2: 3.32981e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[644]	valid_0's l1: 954629	valid_0's l2: 3.33002e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[645]	valid_0's l1: 954649	valid_0's l2: 3.33024e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[646]	valid_0's l1: 954677	valid_0's l2: 3.33041e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[647]	valid_0's l1: 954697	valid_0's l2: 3.33065e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[648]	valid_0's l1: 954701	valid_0's l2: 3.33089e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[649]	valid_0's l1: 954721	valid_0's l2: 3.33116e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[650]	valid_0's l1: 954873	valid_0's l2: 3.33223e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[651]	valid_0's l1: 954860	valid_0's l2: 3.33172e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[652]	valid_0's l1: 955084	valid_0's l2: 3.33195e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[653]	valid_0's l1: 955113	valid_0's l2: 3.33176e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[654]	valid_0's l1: 955084	valid_0's l2: 3.33155e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[655]	valid_0's l1: 954828	valid_0's l2: 3.33105e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[656]	valid_0's l1: 954805	valid_0's l2: 3.33087e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[657]	valid_0's l1: 954784	valid_0's l2: 3.33063e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[658]	valid_0's l1: 954710	valid_0's l2: 3.33001e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[659]	valid_0's l1: 954765	valid_0's l2: 3.3301e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[660]	valid_0's l1: 954862	valid_0's l2: 3.33004e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[661]	valid_0's l1: 954832	valid_0's l2: 3.33026e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[662]	valid_0's l1: 955022	valid_0's l2: 3.33144e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[663]	valid_0's l1: 955306	valid_0's l2: 3.33218e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[664]	valid_0's l1: 955494	valid_0's l2: 3.33338e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[665]	valid_0's l1: 955691	valid_0's l2: 3.33473e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[666]	valid_0's l1: 955918	valid_0's l2: 3.33587e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[667]	valid_0's l1: 955966	valid_0's l2: 3.33648e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[668]	valid_0's l1: 956192	valid_0's l2: 3.33746e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[669]	valid_0's l1: 956538	valid_0's l2: 3.33862e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[670]	valid_0's l1: 956771	valid_0's l2: 3.33964e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[671]	valid_0's l1: 956550	valid_0's l2: 3.33874e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[672]	valid_0's l1: 956669	valid_0's l2: 3.33839e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[673]	valid_0's l1: 956726	valid_0's l2: 3.33868e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[674]	valid_0's l1: 956875	valid_0's l2: 3.33857e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[675]	valid_0's l1: 956693	valid_0's l2: 3.33793e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[676]	valid_0's l1: 956640	valid_0's l2: 3.33721e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[677]	valid_0's l1: 956421	valid_0's l2: 3.3364e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[678]	valid_0's l1: 956478	valid_0's l2: 3.33675e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[679]	valid_0's l1: 956261	valid_0's l2: 3.33598e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[680]	valid_0's l1: 956045	valid_0's l2: 3.33523e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[681]	valid_0's l1: 955977	valid_0's l2: 3.33498e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[682]	valid_0's l1: 955877	valid_0's l2: 3.33465e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[683]	valid_0's l1: 955757	valid_0's l2: 3.33549e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[684]	valid_0's l1: 955527	valid_0's l2: 3.3351e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[685]	valid_0's l1: 955461	valid_0's l2: 3.33489e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[686]	valid_0's l1: 955208	valid_0's l2: 3.33443e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[687]	valid_0's l1: 954884	valid_0's l2: 3.33378e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[688]	valid_0's l1: 954615	valid_0's l2: 3.33396e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[689]	valid_0's l1: 954648	valid_0's l2: 3.33369e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[690]	valid_0's l1: 954378	valid_0's l2: 3.33389e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[691]	valid_0's l1: 954716	valid_0's l2: 3.33469e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[692]	valid_0's l1: 955166	valid_0's l2: 3.33508e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[693]	valid_0's l1: 955300	valid_0's l2: 3.33561e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[694]	valid_0's l1: 955796	valid_0's l2: 3.33632e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[695]	valid_0's l1: 956154	valid_0's l2: 3.33708e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[696]	valid_0's l1: 956532	valid_0's l2: 3.33824e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[697]	valid_0's l1: 956887	valid_0's l2: 3.33826e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[698]	valid_0's l1: 957021	valid_0's l2: 3.33881e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[699]	valid_0's l1: 957326	valid_0's l2: 3.33873e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[700]	valid_0's l1: 957458	valid_0's l2: 3.33927e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[701]	valid_0's l1: 957586	valid_0's l2: 3.33907e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[702]	valid_0's l1: 957718	valid_0's l2: 3.33884e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[703]	valid_0's l1: 957517	valid_0's l2: 3.33844e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[704]	valid_0's l1: 957761	valid_0's l2: 3.33879e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[705]	valid_0's l1: 957617	valid_0's l2: 3.33834e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[706]	valid_0's l1: 957750	valid_0's l2: 3.33817e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[707]	valid_0's l1: 957597	valid_0's l2: 3.33801e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[708]	valid_0's l1: 957530	valid_0's l2: 3.33775e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[709]	valid_0's l1: 957674	valid_0's l2: 3.33761e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[710]	valid_0's l1: 957444	valid_0's l2: 3.33699e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[711]	valid_0's l1: 957211	valid_0's l2: 3.33598e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[712]	valid_0's l1: 957023	valid_0's l2: 3.33551e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[713]	valid_0's l1: 956764	valid_0's l2: 3.33523e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[714]	valid_0's l1: 956624	valid_0's l2: 3.33552e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[715]	valid_0's l1: 956388	valid_0's l2: 3.33454e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[716]	valid_0's l1: 956332	valid_0's l2: 3.33473e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[717]	valid_0's l1: 956077	valid_0's l2: 3.33402e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[718]	valid_0's l1: 955986	valid_0's l2: 3.3341e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[719]	valid_0's l1: 955909	valid_0's l2: 3.33436e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[720]	valid_0's l1: 955832	valid_0's l2: 3.33463e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[721]	valid_0's l1: 955956	valid_0's l2: 3.33478e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[722]	valid_0's l1: 956081	valid_0's l2: 3.33496e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[723]	valid_0's l1: 956326	valid_0's l2: 3.33526e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[724]	valid_0's l1: 956592	valid_0's l2: 3.3356e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[725]	valid_0's l1: 956845	valid_0's l2: 3.33594e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[726]	valid_0's l1: 956893	valid_0's l2: 3.33598e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[727]	valid_0's l1: 956955	valid_0's l2: 3.33591e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[728]	valid_0's l1: 957241	valid_0's l2: 3.33627e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[729]	valid_0's l1: 957637	valid_0's l2: 3.33665e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[730]	valid_0's l1: 957932	valid_0's l2: 3.33704e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[731]	valid_0's l1: 957861	valid_0's l2: 3.33726e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[732]	valid_0's l1: 957852	valid_0's l2: 3.33751e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[733]	valid_0's l1: 957825	valid_0's l2: 3.33791e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[734]	valid_0's l1: 957746	valid_0's l2: 3.33816e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[735]	valid_0's l1: 957699	valid_0's l2: 3.33843e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[736]	valid_0's l1: 957661	valid_0's l2: 3.33871e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[737]	valid_0's l1: 957626	valid_0's l2: 3.33898e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[738]	valid_0's l1: 957803	valid_0's l2: 3.33994e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[739]	valid_0's l1: 957761	valid_0's l2: 3.33996e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[740]	valid_0's l1: 958192	valid_0's l2: 3.34109e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[741]	valid_0's l1: 958455	valid_0's l2: 3.3419e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[742]	valid_0's l1: 958481	valid_0's l2: 3.34246e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[743]	valid_0's l1: 958392	valid_0's l2: 3.3425e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[744]	valid_0's l1: 958330	valid_0's l2: 3.34256e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[745]	valid_0's l1: 958408	valid_0's l2: 3.34316e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[746]	valid_0's l1: 958379	valid_0's l2: 3.34328e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[747]	valid_0's l1: 958350	valid_0's l2: 3.34342e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[748]	valid_0's l1: 958291	valid_0's l2: 3.34374e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[749]	valid_0's l1: 958149	valid_0's l2: 3.34347e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[750]	valid_0's l1: 958124	valid_0's l2: 3.34364e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[751]	valid_0's l1: 957951	valid_0's l2: 3.34379e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[752]	valid_0's l1: 957726	valid_0's l2: 3.34377e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[753]	valid_0's l1: 957573	valid_0's l2: 3.34394e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[754]	valid_0's l1: 957642	valid_0's l2: 3.3441e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[755]	valid_0's l1: 957488	valid_0's l2: 3.34431e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[756]	valid_0's l1: 957358	valid_0's l2: 3.34453e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[757]	valid_0's l1: 957218	valid_0's l2: 3.34476e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[758]	valid_0's l1: 957084	valid_0's l2: 3.34496e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[759]	valid_0's l1: 956989	valid_0's l2: 3.34476e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[760]	valid_0's l1: 957070	valid_0's l2: 3.34494e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[761]	valid_0's l1: 957423	valid_0's l2: 3.3463e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[762]	valid_0's l1: 958108	valid_0's l2: 3.34803e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[763]	valid_0's l1: 958723	valid_0's l2: 3.34952e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[764]	valid_0's l1: 959168	valid_0's l2: 3.35086e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[765]	valid_0's l1: 959949	valid_0's l2: 3.35271e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[766]	valid_0's l1: 960319	valid_0's l2: 3.35441e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[767]	valid_0's l1: 960887	valid_0's l2: 3.35497e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[768]	valid_0's l1: 961678	valid_0's l2: 3.35686e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[769]	valid_0's l1: 961916	valid_0's l2: 3.35761e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[770]	valid_0's l1: 962575	valid_0's l2: 3.35853e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[771]	valid_0's l1: 962892	valid_0's l2: 3.35969e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[772]	valid_0's l1: 963297	valid_0's l2: 3.36129e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[773]	valid_0's l1: 963539	valid_0's l2: 3.36269e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[774]	valid_0's l1: 963721	valid_0's l2: 3.36338e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[775]	valid_0's l1: 963978	valid_0's l2: 3.36461e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[776]	valid_0's l1: 964302	valid_0's l2: 3.36594e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[777]	valid_0's l1: 964420	valid_0's l2: 3.36649e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[778]	valid_0's l1: 964785	valid_0's l2: 3.36792e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[779]	valid_0's l1: 964912	valid_0's l2: 3.36881e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[780]	valid_0's l1: 965199	valid_0's l2: 3.36998e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[781]	valid_0's l1: 965614	valid_0's l2: 3.37136e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[782]	valid_0's l1: 966063	valid_0's l2: 3.37273e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[783]	valid_0's l1: 966313	valid_0's l2: 3.37329e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[784]	valid_0's l1: 966748	valid_0's l2: 3.37457e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[785]	valid_0's l1: 967198	valid_0's l2: 3.37591e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[786]	valid_0's l1: 967273	valid_0's l2: 3.37641e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[787]	valid_0's l1: 967354	valid_0's l2: 3.37692e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[788]	valid_0's l1: 967434	valid_0's l2: 3.37744e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[789]	valid_0's l1: 967892	valid_0's l2: 3.37891e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[790]	valid_0's l1: 967757	valid_0's l2: 3.37897e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[791]	valid_0's l1: 967749	valid_0's l2: 3.3789e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[792]	valid_0's l1: 967928	valid_0's l2: 3.37921e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[793]	valid_0's l1: 967904	valid_0's l2: 3.37946e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[794]	valid_0's l1: 968158	valid_0's l2: 3.37989e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[795]	valid_0's l1: 968272	valid_0's l2: 3.38017e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[796]	valid_0's l1: 968252	valid_0's l2: 3.38008e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[797]	valid_0's l1: 968242	valid_0's l2: 3.38003e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[798]	valid_0's l1: 968234	valid_0's l2: 3.37997e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[799]	valid_0's l1: 968502	valid_0's l2: 3.38068e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[800]	valid_0's l1: 968481	valid_0's l2: 3.38062e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[801]	valid_0's l1: 968365	valid_0's l2: 3.38025e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[802]	valid_0's l1: 968163	valid_0's l2: 3.37968e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[803]	valid_0's l1: 967963	valid_0's l2: 3.37915e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[804]	valid_0's l1: 967794	valid_0's l2: 3.37863e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[805]	valid_0's l1: 967651	valid_0's l2: 3.37813e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[806]	valid_0's l1: 967589	valid_0's l2: 3.37787e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[807]	valid_0's l1: 968034	valid_0's l2: 3.37914e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[808]	valid_0's l1: 967892	valid_0's l2: 3.37869e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[809]	valid_0's l1: 967961	valid_0's l2: 3.37904e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[810]	valid_0's l1: 967907	valid_0's l2: 3.37882e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[811]	valid_0's l1: 968449	valid_0's l2: 3.38109e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[812]	valid_0's l1: 968991	valid_0's l2: 3.38338e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[813]	valid_0's l1: 969678	valid_0's l2: 3.38485e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[814]	valid_0's l1: 970583	valid_0's l2: 3.3881e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[815]	valid_0's l1: 971265	valid_0's l2: 3.38961e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[816]	valid_0's l1: 972143	valid_0's l2: 3.39289e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[817]	valid_0's l1: 972688	valid_0's l2: 3.39516e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[818]	valid_0's l1: 972838	valid_0's l2: 3.39541e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[819]	valid_0's l1: 973180	valid_0's l2: 3.39733e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[820]	valid_0's l1: 973971	valid_0's l2: 3.40014e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[821]	valid_0's l1: 973872	valid_0's l2: 3.39976e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[822]	valid_0's l1: 973790	valid_0's l2: 3.39943e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[823]	valid_0's l1: 973874	valid_0's l2: 3.39918e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[824]	valid_0's l1: 973958	valid_0's l2: 3.39895e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[825]	valid_0's l1: 973883	valid_0's l2: 3.39864e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[826]	valid_0's l1: 973809	valid_0's l2: 3.3983e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[827]	valid_0's l1: 973953	valid_0's l2: 3.3984e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[828]	valid_0's l1: 973747	valid_0's l2: 3.39803e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[829]	valid_0's l1: 973754	valid_0's l2: 3.39776e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[830]	valid_0's l1: 973687	valid_0's l2: 3.39745e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[831]	valid_0's l1: 973817	valid_0's l2: 3.39847e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[832]	valid_0's l1: 973929	valid_0's l2: 3.39938e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[833]	valid_0's l1: 974079	valid_0's l2: 3.40041e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[834]	valid_0's l1: 974209	valid_0's l2: 3.40133e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[835]	valid_0's l1: 974288	valid_0's l2: 3.40156e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[836]	valid_0's l1: 974437	valid_0's l2: 3.40259e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[837]	valid_0's l1: 974433	valid_0's l2: 3.40354e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[838]	valid_0's l1: 974429	valid_0's l2: 3.4045e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[839]	valid_0's l1: 974560	valid_0's l2: 3.40543e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[840]	valid_0's l1: 974777	valid_0's l2: 3.40667e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[841]	valid_0's l1: 975001	valid_0's l2: 3.40703e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[842]	valid_0's l1: 975224	valid_0's l2: 3.40742e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[843]	valid_0's l1: 975473	valid_0's l2: 3.40783e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[844]	valid_0's l1: 975571	valid_0's l2: 3.40778e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[845]	valid_0's l1: 975806	valid_0's l2: 3.40826e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[846]	valid_0's l1: 975976	valid_0's l2: 3.40886e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[847]	valid_0's l1: 976224	valid_0's l2: 3.40934e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[848]	valid_0's l1: 976470	valid_0's l2: 3.40984e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[849]	valid_0's l1: 976759	valid_0's l2: 3.41057e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[850]	valid_0's l1: 977002	valid_0's l2: 3.41111e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[851]	valid_0's l1: 976980	valid_0's l2: 3.41173e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[852]	valid_0's l1: 976934	valid_0's l2: 3.41219e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[853]	valid_0's l1: 976947	valid_0's l2: 3.41262e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[854]	valid_0's l1: 977066	valid_0's l2: 3.4135e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[855]	valid_0's l1: 976943	valid_0's l2: 3.41395e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[856]	valid_0's l1: 977001	valid_0's l2: 3.41452e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[857]	valid_0's l1: 977041	valid_0's l2: 3.41496e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[858]	valid_0's l1: 977025	valid_0's l2: 3.41553e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[859]	valid_0's l1: 977079	valid_0's l2: 3.41612e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[860]	valid_0's l1: 977073	valid_0's l2: 3.41696e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[861]	valid_0's l1: 977002	valid_0's l2: 3.41702e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[862]	valid_0's l1: 976698	valid_0's l2: 3.41678e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[863]	valid_0's l1: 976627	valid_0's l2: 3.41685e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[864]	valid_0's l1: 976557	valid_0's l2: 3.41693e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[865]	valid_0's l1: 976289	valid_0's l2: 3.41647e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[866]	valid_0's l1: 976231	valid_0's l2: 3.4166e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[867]	valid_0's l1: 976109	valid_0's l2: 3.41643e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[868]	valid_0's l1: 976040	valid_0's l2: 3.41653e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[869]	valid_0's l1: 975919	valid_0's l2: 3.41639e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[870]	valid_0's l1: 975850	valid_0's l2: 3.41651e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[871]	valid_0's l1: 975805	valid_0's l2: 3.4168e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[872]	valid_0's l1: 975760	valid_0's l2: 3.4171e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[873]	valid_0's l1: 975714	valid_0's l2: 3.41742e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[874]	valid_0's l1: 975592	valid_0's l2: 3.41741e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[875]	valid_0's l1: 975668	valid_0's l2: 3.41803e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[876]	valid_0's l1: 975494	valid_0's l2: 3.41786e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[877]	valid_0's l1: 975376	valid_0's l2: 3.4178e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[878]	valid_0's l1: 975205	valid_0's l2: 3.41764e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[879]	valid_0's l1: 975055	valid_0's l2: 3.41748e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[880]	valid_0's l1: 975025	valid_0's l2: 3.41785e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[881]	valid_0's l1: 975250	valid_0's l2: 3.41759e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[882]	valid_0's l1: 975349	valid_0's l2: 3.4176e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[883]	valid_0's l1: 975358	valid_0's l2: 3.41754e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[884]	valid_0's l1: 975295	valid_0's l2: 3.41746e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[885]	valid_0's l1: 975330	valid_0's l2: 3.41707e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[886]	valid_0's l1: 975422	valid_0's l2: 3.41728e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[887]	valid_0's l1: 975706	valid_0's l2: 3.41844e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[888]	valid_0's l1: 975873	valid_0's l2: 3.41836e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[889]	valid_0's l1: 976184	valid_0's l2: 3.41947e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[890]	valid_0's l1: 976282	valid_0's l2: 3.41941e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[891]	valid_0's l1: 976330	valid_0's l2: 3.41936e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[892]	valid_0's l1: 976578	valid_0's l2: 3.41986e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[893]	valid_0's l1: 976754	valid_0's l2: 3.42e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[894]	valid_0's l1: 976757	valid_0's l2: 3.42005e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[895]	valid_0's l1: 977026	valid_0's l2: 3.42055e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[896]	valid_0's l1: 977370	valid_0's l2: 3.42128e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[897]	valid_0's l1: 977661	valid_0's l2: 3.42181e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[898]	valid_0's l1: 977776	valid_0's l2: 3.42206e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[899]	valid_0's l1: 977566	valid_0's l2: 3.42139e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[900]	valid_0's l1: 977595	valid_0's l2: 3.42159e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[901]	valid_0's l1: 977844	valid_0's l2: 3.42228e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[902]	valid_0's l1: 977959	valid_0's l2: 3.42287e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[903]	valid_0's l1: 977667	valid_0's l2: 3.42234e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[904]	valid_0's l1: 977762	valid_0's l2: 3.42286e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[905]	valid_0's l1: 977471	valid_0's l2: 3.42234e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[906]	valid_0's l1: 977922	valid_0's l2: 3.42357e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[907]	valid_0's l1: 977585	valid_0's l2: 3.42246e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[908]	valid_0's l1: 977676	valid_0's l2: 3.42299e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[909]	valid_0's l1: 977884	valid_0's l2: 3.42379e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[910]	valid_0's l1: 977684	valid_0's l2: 3.42323e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[911]	valid_0's l1: 977849	valid_0's l2: 3.42334e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[912]	valid_0's l1: 978048	valid_0's l2: 3.42353e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[913]	valid_0's l1: 978176	valid_0's l2: 3.42359e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[914]	valid_0's l1: 978305	valid_0's l2: 3.42366e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[915]	valid_0's l1: 978499	valid_0's l2: 3.42429e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[916]	valid_0's l1: 978631	valid_0's l2: 3.42438e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[917]	valid_0's l1: 978811	valid_0's l2: 3.42499e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[918]	valid_0's l1: 979430	valid_0's l2: 3.42598e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[919]	valid_0's l1: 979578	valid_0's l2: 3.4261e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[920]	valid_0's l1: 979806	valid_0's l2: 3.42636e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[921]	valid_0's l1: 980449	valid_0's l2: 3.42789e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[922]	valid_0's l1: 981287	valid_0's l2: 3.42954e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[923]	valid_0's l1: 981183	valid_0's l2: 3.42966e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[924]	valid_0's l1: 982025	valid_0's l2: 3.43132e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[925]	valid_0's l1: 981953	valid_0's l2: 3.43143e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[926]	valid_0's l1: 982333	valid_0's l2: 3.43171e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[927]	valid_0's l1: 982911	valid_0's l2: 3.43296e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[928]	valid_0's l1: 983807	valid_0's l2: 3.43482e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[929]	valid_0's l1: 983669	valid_0's l2: 3.4349e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[930]	valid_0's l1: 984519	valid_0's l2: 3.43665e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[931]	valid_0's l1: 984574	valid_0's l2: 3.43711e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[932]	valid_0's l1: 984734	valid_0's l2: 3.43758e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[933]	valid_0's l1: 984821	valid_0's l2: 3.43824e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[934]	valid_0's l1: 984845	valid_0's l2: 3.43846e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[935]	valid_0's l1: 984731	valid_0's l2: 3.43864e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[936]	valid_0's l1: 984864	valid_0's l2: 3.43922e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[937]	valid_0's l1: 984848	valid_0's l2: 3.43958e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[938]	valid_0's l1: 985005	valid_0's l2: 3.44006e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[939]	valid_0's l1: 985150	valid_0's l2: 3.4406e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[940]	valid_0's l1: 985037	valid_0's l2: 3.44079e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[941]	valid_0's l1: 984958	valid_0's l2: 3.44055e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[942]	valid_0's l1: 984852	valid_0's l2: 3.43994e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[943]	valid_0's l1: 984746	valid_0's l2: 3.43935e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[944]	valid_0's l1: 984641	valid_0's l2: 3.43878e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[945]	valid_0's l1: 984537	valid_0's l2: 3.43823e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[946]	valid_0's l1: 984433	valid_0's l2: 3.4377e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[947]	valid_0's l1: 984359	valid_0's l2: 3.43753e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[948]	valid_0's l1: 984251	valid_0's l2: 3.43696e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[949]	valid_0's l1: 984178	valid_0's l2: 3.43648e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[950]	valid_0's l1: 984106	valid_0's l2: 3.43602e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[951]	valid_0's l1: 983927	valid_0's l2: 3.43544e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[952]	valid_0's l1: 983857	valid_0's l2: 3.43549e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[953]	valid_0's l1: 983572	valid_0's l2: 3.43474e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[954]	valid_0's l1: 983396	valid_0's l2: 3.43418e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[955]	valid_0's l1: 983220	valid_0's l2: 3.43364e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[956]	valid_0's l1: 983045	valid_0's l2: 3.43311e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[957]	valid_0's l1: 982871	valid_0's l2: 3.43258e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[958]	valid_0's l1: 982642	valid_0's l2: 3.43202e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[959]	valid_0's l1: 982376	valid_0's l2: 3.43134e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[960]	valid_0's l1: 982137	valid_0's l2: 3.43067e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[961]	valid_0's l1: 982250	valid_0's l2: 3.431e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[962]	valid_0's l1: 981963	valid_0's l2: 3.43066e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[963]	valid_0's l1: 981920	valid_0's l2: 3.43119e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[964]	valid_0's l1: 981984	valid_0's l2: 3.4315e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[965]	valid_0's l1: 982081	valid_0's l2: 3.43189e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[966]	valid_0's l1: 981805	valid_0's l2: 3.43159e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[967]	valid_0's l1: 982094	valid_0's l2: 3.43235e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[968]	valid_0's l1: 982003	valid_0's l2: 3.43206e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[969]	valid_0's l1: 982363	valid_0's l2: 3.43299e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[970]	valid_0's l1: 982686	valid_0's l2: 3.43376e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[971]	valid_0's l1: 982871	valid_0's l2: 3.43433e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[972]	valid_0's l1: 983258	valid_0's l2: 3.43556e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[973]	valid_0's l1: 983693	valid_0's l2: 3.43696e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[974]	valid_0's l1: 984137	valid_0's l2: 3.43839e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[975]	valid_0's l1: 984642	valid_0's l2: 3.43968e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[976]	valid_0's l1: 985119	valid_0's l2: 3.44091e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[977]	valid_0's l1: 985570	valid_0's l2: 3.44231e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[978]	valid_0's l1: 985937	valid_0's l2: 3.44337e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[979]	valid_0's l1: 986400	valid_0's l2: 3.44484e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[980]	valid_0's l1: 986848	valid_0's l2: 3.44664e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[981]	valid_0's l1: 986973	valid_0's l2: 3.44709e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[982]	valid_0's l1: 987041	valid_0's l2: 3.44727e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[983]	valid_0's l1: 987191	valid_0's l2: 3.4476e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[984]	valid_0's l1: 987341	valid_0's l2: 3.44794e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[985]	valid_0's l1: 987438	valid_0's l2: 3.44853e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[986]	valid_0's l1: 987604	valid_0's l2: 3.44879e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[987]	valid_0's l1: 987632	valid_0's l2: 3.44904e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[988]	valid_0's l1: 987780	valid_0's l2: 3.4494e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[989]	valid_0's l1: 988024	valid_0's l2: 3.44993e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[990]	valid_0's l1: 988190	valid_0's l2: 3.45031e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[991]	valid_0's l1: 988468	valid_0's l2: 3.4517e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[992]	valid_0's l1: 988744	valid_0's l2: 3.45311e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[993]	valid_0's l1: 989017	valid_0's l2: 3.45447e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[994]	valid_0's l1: 989336	valid_0's l2: 3.45579e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[995]	valid_0's l1: 989601	valid_0's l2: 3.45711e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[996]	valid_0's l1: 989925	valid_0's l2: 3.4586e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[997]	valid_0's l1: 990135	valid_0's l2: 3.45916e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[998]	valid_0's l1: 990593	valid_0's l2: 3.46048e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[999]	valid_0's l1: 990971	valid_0's l2: 3.46198e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[1000]	valid_0's l1: 991312	valid_0's l2: 3.46342e+12	valid_0's auc: 1
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[1001]	valid_0's l1: 991358	valid_0's l2: 3.46362e+12	valid_0's auc: 1
Early stopping, best iteration is:
[1]	valid_0's l1: 867844	valid_0's l2: 3.0181e+12	valid_0's auc: 1
Out[84]:
LGBMRegressor(bagging_fraction=0.7, bagging_freq=10, feature_fraction=0.9,
              learning_rate=0.005, max_bin=512, max_depth=8,
              metric=['l2', 'auc'], n_estimators=1000, num_iterations=100000,
              num_leaves=128, objective='regression', task='train', verbose=0)
In [85]:
y_pred = gbm.predict(X_test2, num_iteration=gbm.best_iteration_)

# MSE Computation
lightgbm_MSE = mean_squared_error(y_test2, y_pred)
print('The MSE of LightGBM is: ', lightgbm_MSE)

# MAE Computation
lightgbm_MAE = mean_absolute_error(y_test2, y_pred)
print('The MAE of the LightGBM model is: ', lightgbm_MAE)

# RMSE Computation
print('The RMSLE of prediction for LightGBM is:', round(mean_squared_log_error(y_test2, y_pred) ** 0.5, 5))

## Storing the new MSEs and MAEs in original list
Model_MSEs.append(lightgbm_MSE)
Model_MAEs.append(lightgbm_MAE)
The MSE of LightGBM is:  3018098107875.8013
The MAE of the LightGBM model is:  867843.7910496623
The RMSLE of prediction for LightGBM is: 0.54016

III - SVM

In [86]:
from sklearn.svm import SVR

SVR_regressor = SVR(kernel = 'rbf')
SVR_regressor.fit(X_train2, y_train2)

#predict new results
y_pred_SVR = SVR_regressor.predict(X_test2)


# MSE Computation
SVM_MSE = mean_squared_error(y_test2, y_pred_SVR)
print('The MSE of SVM is: ', SVM_MSE)

# MAE Computation
SVM_MAE = mean_absolute_error(y_test2, y_pred_SVR)
print('The MAE of the SVM model is: ', SVM_MAE)

# RMSE Computation
print('The RMSLE of prediction for SVM is:', round(mean_squared_log_error(y_test2, y_pred_SVR) ** 0.5, 5))

## Storing the new MSEs and MAEs in original list
Model_MSEs.append(SVM_MSE)
Model_MAEs.append(SVM_MAE)
The MSE of SVM is:  3750366532877.8027
The MAE of the SVM model is:  919970.7939596028
The RMSLE of prediction for SVM is: 0.68186

IV - KNN Regressor

In [87]:
from sklearn import neighbors

MSE_val = [] #to store MSE values for different k

for K in range(0,20):
    K = K+1
    model_KNN = neighbors.KNeighborsRegressor(n_neighbors = K)

    model_KNN.fit(X_train2, y_train2)  #fit the model
    pred_KNN = model_KNN.predict(X_test2) #make prediction on test set
    MSE = mean_squared_error(y_test2, pred_KNN) #calculate MSE
    MSE_val.append(MSE) #store MSE
    print('MSE value for k= ' , K , 'is:', MSE)
MSE value for k=  1 is: 8321404956966.989
MSE value for k=  2 is: 6222780952678.619
MSE value for k=  3 is: 4825288717164.992
MSE value for k=  4 is: 4293166490644.685
MSE value for k=  5 is: 3969113401814.453
MSE value for k=  6 is: 3798782460369.135
MSE value for k=  7 is: 3645461689115.9316
MSE value for k=  8 is: 3581255864290.681
MSE value for k=  9 is: 3296527693134.2715
MSE value for k=  10 is: 3303555662946.574
MSE value for k=  11 is: 3350781591686.734
MSE value for k=  12 is: 3280619165935.4873
MSE value for k=  13 is: 3239504622158.3735
MSE value for k=  14 is: 3227054291157.994
MSE value for k=  15 is: 3230737560392.5127
MSE value for k=  16 is: 3199004507679.566
MSE value for k=  17 is: 3209315803395.306
MSE value for k=  18 is: 3159733489007.9795
MSE value for k=  19 is: 3157394991300.2236
MSE value for k=  20 is: 3167443466203.4243
In [88]:
#Plotting MSE values
curve = pd.DataFrame(MSE_val) #elbow curve 
curve.plot()
Out[88]:
<matplotlib.axes._subplots.AxesSubplot at 0x1f20dec2748>

We can see that the lowest value of MSE is at K = 20

In [89]:
model_KNN = neighbors.KNeighborsRegressor(n_neighbors = 20)
model_KNN.fit(X_train2, y_train2)  #fit the model

#predict new results
y_pred_KNN = model_KNN.predict(X_test2)

# MSE Computation
KNN_MSE = mean_squared_error(y_test2, y_pred_KNN)
print('The MSE of KNN is: ', KNN_MSE)

# MAE Computation
KNN_MAE = mean_absolute_error(y_test2, y_pred_KNN)
print('The MAE of the KNN model is: ', KNN_MAE)

# RMSE Computation
print('The RMSLE of prediction for KNN is:', round(mean_squared_log_error(y_test2, y_pred_KNN) ** 0.5, 5))

## Storing the new MSEs and MAEs in original list
Model_MSEs.append(KNN_MSE)
Model_MAEs.append(KNN_MAE)
The MSE of KNN is:  3167443466203.4243
The MAE of the KNN model is:  935110.408021978
The RMSLE of prediction for KNN is: 0.59097

Visualization of all models tested

In [90]:
Model_MSEs
Out[90]:
[3729015820318.8193,
 4088391912643.5757,
 3250046319628.9155,
 3246170992333.408,
 3155399375930.833,
 3018098107875.8013,
 3750366532877.8027,
 3167443466203.4243]
In [91]:
Model_MAEs
Out[91]:
[1037114.5604395604,
 1013861.8161276642,
 981656.8229258836,
 980253.8586267291,
 944116.9487122253,
 867843.7910496623,
 919970.7939596028,
 935110.408021978]
In [92]:
Models = ['Random Forest Regressor',
          'Gradient Boosting Regressor',
          'Lasso',
          'Ridge',
         'XGBoost',
         'LightGBM',
         'SVM',
         'KNN']

d = {'Model' : Models, 'MSE': Model_MSEs}
MSE_vis = pd.DataFrame(d)

fig_dims = (20, 10)
fig, ax = plt.subplots(figsize=fig_dims)
ax = sns.barplot(x="Model", y="MSE", data=d, ax = ax)
ax.set_title("MSE of ALL models", pad=10, fontdict={'fontsize': 20})
ax.set_xlabel("Regression models",fontsize=20)
ax.set_xticklabels(ax.get_xticklabels(), rotation=30)
save_fig("MSE_allmodels")

plt.show()
Saving figure MSE_allmodels
In [93]:
Models = ['Random Forest Regressor',
          'Gradient Boosting Regressor',
          'Lasso',
          'Ridge',
         'XGBoost',
         'LightGBM',
         'SVM',
         'KNN']

d = {'Model' : Models, 'MSE': Model_MAEs}
MSE_vis = pd.DataFrame(d)

fig_dims = (20, 10)
fig, ax = plt.subplots(figsize=fig_dims)
ax = sns.barplot(x="Model", y="MSE", data=d, ax = ax)
ax.set_title("MAEs of ALL models", pad=10, fontdict={'fontsize': 20})
ax.set_xlabel("Regression models",fontsize=20)
ax.set_xticklabels(ax.get_xticklabels(), rotation=30)
save_fig("MAE_allmodels")

plt.show()
Saving figure MAE_allmodels
In [94]:
all_models_performance = {'Model' : Models, 'MSE': Model_MSEs, 'MAEs': Model_MAEs}
performance = pd.DataFrame(all_models_performance)
performance
Out[94]:
Model MSE MAEs
0 Random Forest Regressor 3.729016e+12 1.037115e+06
1 Gradient Boosting Regressor 4.088392e+12 1.013862e+06
2 Lasso 3.250046e+12 9.816568e+05
3 Ridge 3.246171e+12 9.802539e+05
4 XGBoost 3.155399e+12 9.441169e+05
5 LightGBM 3.018098e+12 8.678438e+05
6 SVM 3.750367e+12 9.199708e+05
7 KNN 3.167443e+12 9.351104e+05
In [95]:
#Rank the models based on MSE performance
performance.sort_values('MSE')
Out[95]:
Model MSE MAEs
5 LightGBM 3.018098e+12 8.678438e+05
4 XGBoost 3.155399e+12 9.441169e+05
7 KNN 3.167443e+12 9.351104e+05
3 Ridge 3.246171e+12 9.802539e+05
2 Lasso 3.250046e+12 9.816568e+05
0 Random Forest Regressor 3.729016e+12 1.037115e+06
6 SVM 3.750367e+12 9.199708e+05
1 Gradient Boosting Regressor 4.088392e+12 1.013862e+06

Best Models

From the above results, we can see that the models with the lowest MSE are :

  1. LightGBM
  2. XGBoost

These are the models that will be fined tuned to find the final model that will be evaluated with our validation (initial test) dataset


Fine Tuning Top 2 Performing Models

Here we'll take the top 2 models and evaluate their performance.

TBC:
Cross validation will be also performed on both models.

LigthGBM: Fine tuning and Time-Based Cross validation

In [96]:
#parameters that we will tune
search_params = {'learning_rate': 0.001, #learning rate
                 'max_depth': 2000, #max depth of each trained tree -- has impact on model performance and training time
                 'num_leaves': 250, #important parameter -- controls complexity of model --> num leaves = 2^ (max_depth)
                 'feature_fraction': 0.8, #column sampling - % of features selected
                 'subsample': 0.2}

#parameters that should not change
fixed_params = {'objective': 'regression',
              'metric': ['l2', 'auc'],
              'boosting':'gbdt',
              'num_boost_round':300,
              'early_stopping_rounds':30}

#def train_evaluate(search_params):
train = lgb.Dataset(X_train2, label=y_train2)
valid_data = lgb.Dataset(X_test2, label=y_test2)

params = {'metric':fixed_params['metric'],
         'objective':fixed_params['objective'],
         **search_params}

gbm_TUNED = lgb.LGBMRegressor(**params)

gbm_TUNED.fit(X_train2, y_train2,
    eval_set=[(X_test2, y_test2)],
    eval_metric='l1',
    early_stopping_rounds=10000)

num_iteration_GBM = gbm_TUNED.best_iteration_
#best_param = gbm_TUNED.best_params_
score = gbm_TUNED.best_score_
    
    #return score, num_iteration_GBM

#train_evaluate(search_params)

#print("Best parameters found: ", gbm_TUNED.best_params_)
[LightGBM] [Warning] feature_fraction is set=0.8, colsample_bytree=1.0 will be ignored. Current value: feature_fraction=0.8
[1]	valid_0's l1: 867909	valid_0's l2: 3.0182e+12	valid_0's auc: 1
Training until validation scores don't improve for 10000 rounds
[2]	valid_0's l1: 867773	valid_0's l2: 3.01806e+12	valid_0's auc: 1
[3]	valid_0's l1: 867697	valid_0's l2: 3.01809e+12	valid_0's auc: 1
[4]	valid_0's l1: 867581	valid_0's l2: 3.01838e+12	valid_0's auc: 1
[5]	valid_0's l1: 867396	valid_0's l2: 3.01829e+12	valid_0's auc: 1
[6]	valid_0's l1: 867244	valid_0's l2: 3.01836e+12	valid_0's auc: 1
[7]	valid_0's l1: 867117	valid_0's l2: 3.01826e+12	valid_0's auc: 1
[8]	valid_0's l1: 866914	valid_0's l2: 3.01825e+12	valid_0's auc: 1
[9]	valid_0's l1: 866814	valid_0's l2: 3.01846e+12	valid_0's auc: 1
[10]	valid_0's l1: 866692	valid_0's l2: 3.01848e+12	valid_0's auc: 1
[11]	valid_0's l1: 866523	valid_0's l2: 3.01835e+12	valid_0's auc: 1
[12]	valid_0's l1: 866364	valid_0's l2: 3.01813e+12	valid_0's auc: 1
[13]	valid_0's l1: 866266	valid_0's l2: 3.01819e+12	valid_0's auc: 1
[14]	valid_0's l1: 866160	valid_0's l2: 3.01814e+12	valid_0's auc: 1
[15]	valid_0's l1: 866098	valid_0's l2: 3.01811e+12	valid_0's auc: 1
[16]	valid_0's l1: 866081	valid_0's l2: 3.01826e+12	valid_0's auc: 1
[17]	valid_0's l1: 865953	valid_0's l2: 3.0182e+12	valid_0's auc: 1
[18]	valid_0's l1: 865839	valid_0's l2: 3.01822e+12	valid_0's auc: 1
[19]	valid_0's l1: 865730	valid_0's l2: 3.01818e+12	valid_0's auc: 1
[20]	valid_0's l1: 865606	valid_0's l2: 3.01814e+12	valid_0's auc: 1
[21]	valid_0's l1: 865484	valid_0's l2: 3.01809e+12	valid_0's auc: 1
[22]	valid_0's l1: 865359	valid_0's l2: 3.018e+12	valid_0's auc: 1
[23]	valid_0's l1: 865226	valid_0's l2: 3.01817e+12	valid_0's auc: 1
[24]	valid_0's l1: 865119	valid_0's l2: 3.01822e+12	valid_0's auc: 1
[25]	valid_0's l1: 865014	valid_0's l2: 3.01815e+12	valid_0's auc: 1
[26]	valid_0's l1: 864877	valid_0's l2: 3.01807e+12	valid_0's auc: 1
[27]	valid_0's l1: 864738	valid_0's l2: 3.01796e+12	valid_0's auc: 1
[28]	valid_0's l1: 864589	valid_0's l2: 3.01804e+12	valid_0's auc: 1
[29]	valid_0's l1: 864459	valid_0's l2: 3.01804e+12	valid_0's auc: 1
[30]	valid_0's l1: 864293	valid_0's l2: 3.0178e+12	valid_0's auc: 1
[31]	valid_0's l1: 864177	valid_0's l2: 3.0178e+12	valid_0's auc: 1
[32]	valid_0's l1: 864038	valid_0's l2: 3.01768e+12	valid_0's auc: 1
[33]	valid_0's l1: 863909	valid_0's l2: 3.01777e+12	valid_0's auc: 1
[34]	valid_0's l1: 863780	valid_0's l2: 3.01773e+12	valid_0's auc: 1
[35]	valid_0's l1: 863664	valid_0's l2: 3.01775e+12	valid_0's auc: 1
[36]	valid_0's l1: 863526	valid_0's l2: 3.01767e+12	valid_0's auc: 1
[37]	valid_0's l1: 863330	valid_0's l2: 3.01758e+12	valid_0's auc: 1
[38]	valid_0's l1: 863131	valid_0's l2: 3.01746e+12	valid_0's auc: 1
[39]	valid_0's l1: 863092	valid_0's l2: 3.01765e+12	valid_0's auc: 1
[40]	valid_0's l1: 862909	valid_0's l2: 3.01762e+12	valid_0's auc: 1
[41]	valid_0's l1: 862860	valid_0's l2: 3.01773e+12	valid_0's auc: 1
[42]	valid_0's l1: 862723	valid_0's l2: 3.01763e+12	valid_0's auc: 1
[43]	valid_0's l1: 862544	valid_0's l2: 3.01766e+12	valid_0's auc: 1
[44]	valid_0's l1: 862407	valid_0's l2: 3.01757e+12	valid_0's auc: 1
[45]	valid_0's l1: 862254	valid_0's l2: 3.01743e+12	valid_0's auc: 1
[46]	valid_0's l1: 862083	valid_0's l2: 3.01725e+12	valid_0's auc: 1
[47]	valid_0's l1: 861977	valid_0's l2: 3.01729e+12	valid_0's auc: 1
[48]	valid_0's l1: 861795	valid_0's l2: 3.01727e+12	valid_0's auc: 1
[49]	valid_0's l1: 861661	valid_0's l2: 3.01723e+12	valid_0's auc: 1
[50]	valid_0's l1: 861523	valid_0's l2: 3.01721e+12	valid_0's auc: 1
[51]	valid_0's l1: 861393	valid_0's l2: 3.01712e+12	valid_0's auc: 1
[52]	valid_0's l1: 861365	valid_0's l2: 3.01718e+12	valid_0's auc: 1
[53]	valid_0's l1: 861253	valid_0's l2: 3.01711e+12	valid_0's auc: 1
[54]	valid_0's l1: 861126	valid_0's l2: 3.01712e+12	valid_0's auc: 1
[55]	valid_0's l1: 860996	valid_0's l2: 3.01709e+12	valid_0's auc: 1
[56]	valid_0's l1: 861061	valid_0's l2: 3.0169e+12	valid_0's auc: 1
[57]	valid_0's l1: 860932	valid_0's l2: 3.01684e+12	valid_0's auc: 1
[58]	valid_0's l1: 860913	valid_0's l2: 3.01633e+12	valid_0's auc: 1
[59]	valid_0's l1: 860810	valid_0's l2: 3.01629e+12	valid_0's auc: 1
[60]	valid_0's l1: 860675	valid_0's l2: 3.01632e+12	valid_0's auc: 1
[61]	valid_0's l1: 860561	valid_0's l2: 3.01649e+12	valid_0's auc: 1
[62]	valid_0's l1: 860565	valid_0's l2: 3.01615e+12	valid_0's auc: 1
[63]	valid_0's l1: 860403	valid_0's l2: 3.01606e+12	valid_0's auc: 1
[64]	valid_0's l1: 860300	valid_0's l2: 3.01618e+12	valid_0's auc: 1
[65]	valid_0's l1: 860194	valid_0's l2: 3.01619e+12	valid_0's auc: 1
[66]	valid_0's l1: 860022	valid_0's l2: 3.01625e+12	valid_0's auc: 1
[67]	valid_0's l1: 859920	valid_0's l2: 3.01621e+12	valid_0's auc: 1
[68]	valid_0's l1: 859814	valid_0's l2: 3.01614e+12	valid_0's auc: 1
[69]	valid_0's l1: 859663	valid_0's l2: 3.01607e+12	valid_0's auc: 1
[70]	valid_0's l1: 859541	valid_0's l2: 3.01612e+12	valid_0's auc: 1
[71]	valid_0's l1: 859391	valid_0's l2: 3.01604e+12	valid_0's auc: 1
[72]	valid_0's l1: 859285	valid_0's l2: 3.01599e+12	valid_0's auc: 1
[73]	valid_0's l1: 859151	valid_0's l2: 3.01608e+12	valid_0's auc: 1
[74]	valid_0's l1: 859057	valid_0's l2: 3.01609e+12	valid_0's auc: 1
[75]	valid_0's l1: 859077	valid_0's l2: 3.01573e+12	valid_0's auc: 1
[76]	valid_0's l1: 858958	valid_0's l2: 3.01572e+12	valid_0's auc: 1
[77]	valid_0's l1: 858829	valid_0's l2: 3.01568e+12	valid_0's auc: 1
[78]	valid_0's l1: 858700	valid_0's l2: 3.01579e+12	valid_0's auc: 1
[79]	valid_0's l1: 858571	valid_0's l2: 3.01566e+12	valid_0's auc: 1
[80]	valid_0's l1: 858451	valid_0's l2: 3.0156e+12	valid_0's auc: 1
[81]	valid_0's l1: 858340	valid_0's l2: 3.01562e+12	valid_0's auc: 1
[82]	valid_0's l1: 858157	valid_0's l2: 3.01552e+12	valid_0's auc: 1
[83]	valid_0's l1: 858106	valid_0's l2: 3.01552e+12	valid_0's auc: 1
[84]	valid_0's l1: 857932	valid_0's l2: 3.01544e+12	valid_0's auc: 1
[85]	valid_0's l1: 857940	valid_0's l2: 3.01516e+12	valid_0's auc: 1
[86]	valid_0's l1: 857856	valid_0's l2: 3.01519e+12	valid_0's auc: 1
[87]	valid_0's l1: 857728	valid_0's l2: 3.01518e+12	valid_0's auc: 1
[88]	valid_0's l1: 857671	valid_0's l2: 3.01537e+12	valid_0's auc: 1
[89]	valid_0's l1: 857574	valid_0's l2: 3.01528e+12	valid_0's auc: 1
[90]	valid_0's l1: 857483	valid_0's l2: 3.01546e+12	valid_0's auc: 1
[91]	valid_0's l1: 857334	valid_0's l2: 3.01548e+12	valid_0's auc: 1
[92]	valid_0's l1: 857191	valid_0's l2: 3.01558e+12	valid_0's auc: 1
[93]	valid_0's l1: 857155	valid_0's l2: 3.0151e+12	valid_0's auc: 1
[94]	valid_0's l1: 857020	valid_0's l2: 3.01509e+12	valid_0's auc: 1
[95]	valid_0's l1: 856889	valid_0's l2: 3.01525e+12	valid_0's auc: 1
[96]	valid_0's l1: 856788	valid_0's l2: 3.01524e+12	valid_0's auc: 1
[97]	valid_0's l1: 856673	valid_0's l2: 3.01523e+12	valid_0's auc: 1
[98]	valid_0's l1: 856624	valid_0's l2: 3.0153e+12	valid_0's auc: 1
[99]	valid_0's l1: 856513	valid_0's l2: 3.01528e+12	valid_0's auc: 1
[100]	valid_0's l1: 856425	valid_0's l2: 3.01532e+12	valid_0's auc: 1
Did not meet early stopping. Best iteration is:
[100]	valid_0's l1: 856425	valid_0's l2: 3.01532e+12	valid_0's auc: 1
In [97]:
y_pred = gbm_TUNED.predict(X_test2, num_iteration=gbm_TUNED.best_iteration_)

# Comparing with original MSE, MAE and RMSE
print('----- LightGBM: Comparing with original MSE, MAE and RMSE -----')
print('---------------------------------------------------------------')

# MSE Computation
TUNED_lightgbm_MSE = mean_squared_error(y_test2, y_pred)
print('The MSE of tuned LightGBM is: ', TUNED_lightgbm_MSE)
print('MSE decreased: ', (1-(TUNED_lightgbm_MSE/lightgbm_MSE))*100, ' %')
print('---------------------------------------------------------------')

# MAE Computation
TUNED_lightgbm_MAE = mean_absolute_error(y_test2, y_pred)
print('The MAE of the tuned LightGBM model is: ', TUNED_lightgbm_MAE)
print('MAE decreased: ', (1-(TUNED_lightgbm_MAE/lightgbm_MAE))*100, ' %')
print('---------------------------------------------------------------')

# RMSE Computation
print('The RMSE of prediction for tuned LightGBM is:', round(mean_squared_log_error(y_test2, y_pred) ** 0.5, 5))
#print(lightgbm_MSE)
#print(TUNED_lightgbm_MSE)
----- LightGBM: Comparing with original MSE, MAE and RMSE -----
---------------------------------------------------------------
The MSE of tuned LightGBM is:  3015320817332.022
MSE decreased:  0.09202121483499193  %
---------------------------------------------------------------
The MAE of the tuned LightGBM model is:  856424.5829748743
MAE decreased:  1.3158137665508218  %
---------------------------------------------------------------
The RMSE of prediction for tuned LightGBM is: 0.53472

XGBOOST : Fine tuning and Time-Based Cross validation

In [98]:
from sklearn.model_selection import RandomizedSearchCV
import xgboost as xgb

# Create the parameter grid: gbm_param_grid
gbm_param_grid = {
    'n_estimators': [25],
    'max_depth': range(2, 12)
}

# Instantiate the regressor: gbm
xgb_TUNED = xgb.XGBRegressor(n_estimators=10)

# Perform random search: randomized_mse
randomized_mse = RandomizedSearchCV(param_distributions=gbm_param_grid, estimator=xgb_TUNED, 
                                    scoring='neg_mean_squared_error', n_iter=5, cv=4, 
                                   verbose=1)

# Fit randomized_mse to the data
randomized_mse.fit(X_train2, y_train2, eval_set=[(X_test2, y_test2)])

# Print the best parameters and lowest RMSE
print("Best parameters found: ", randomized_mse.best_params_)
print("Lowest RMSE found: ", np.sqrt(np.abs(randomized_mse.best_score_)))
Fitting 4 folds for each of 5 candidates, totalling 20 fits
[0]	validation_0-rmse:2305969.25000
[1]	validation_0-rmse:2207415.00000
[2]	validation_0-rmse:2090851.37500
[3]	validation_0-rmse:2072357.37500
[4]	validation_0-rmse:2101785.75000
[5]	validation_0-rmse:2162195.25000
[6]	validation_0-rmse:2181959.50000
[7]	validation_0-rmse:2181607.50000
[8]	validation_0-rmse:2175874.00000
[9]	validation_0-rmse:2203817.50000
[10]	validation_0-rmse:2211508.00000
[11]	validation_0-rmse:2223088.25000
[12]	validation_0-rmse:2231330.75000
[13]	validation_0-rmse:2241714.50000
[14]	validation_0-rmse:2247463.00000
[15]	validation_0-rmse:2240430.00000
[16]	validation_0-rmse:2242313.75000
[17]	validation_0-rmse:2240604.00000
[18]	validation_0-rmse:2242682.25000
[19]	validation_0-rmse:2242233.25000
[20]	validation_0-rmse:2241886.25000
[21]	validation_0-rmse:2245903.75000
[22]	validation_0-rmse:2247574.75000
[23]	validation_0-rmse:2245516.75000
[24]	validation_0-rmse:2249170.75000
[0]	validation_0-rmse:2277735.50000
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[1]	validation_0-rmse:2125757.00000
[2]	validation_0-rmse:2077120.87500
[3]	validation_0-rmse:2036837.62500
[4]	validation_0-rmse:2061448.87500
[5]	validation_0-rmse:2063506.75000
[6]	validation_0-rmse:2070335.37500
[7]	validation_0-rmse:2084084.37500
[8]	validation_0-rmse:2083673.25000
[9]	validation_0-rmse:2094144.25000
[10]	validation_0-rmse:2095160.00000
[11]	validation_0-rmse:2109293.25000
[12]	validation_0-rmse:2114233.75000
[13]	validation_0-rmse:2114560.75000
[14]	validation_0-rmse:2117615.75000
[15]	validation_0-rmse:2125555.50000
[16]	validation_0-rmse:2126717.75000
[17]	validation_0-rmse:2124461.75000
[18]	validation_0-rmse:2130208.50000
[19]	validation_0-rmse:2129898.75000
[20]	validation_0-rmse:2128244.25000
[21]	validation_0-rmse:2131081.25000
[22]	validation_0-rmse:2133882.75000
[23]	validation_0-rmse:2135873.50000
[24]	validation_0-rmse:2142432.50000
[0]	validation_0-rmse:2313586.25000
[1]	validation_0-rmse:2238876.75000
[2]	validation_0-rmse:2260152.00000
[3]	validation_0-rmse:2330059.50000
[4]	validation_0-rmse:2344397.75000
[5]	validation_0-rmse:2331451.50000
[6]	validation_0-rmse:2385339.75000
[7]	validation_0-rmse:2379440.25000
[8]	validation_0-rmse:2381468.75000
[9]	validation_0-rmse:2391759.75000
[10]	validation_0-rmse:2408897.25000
[11]	validation_0-rmse:2423456.50000
[12]	validation_0-rmse:2440786.00000
[13]	validation_0-rmse:2442211.75000
[14]	validation_0-rmse:2442581.75000
[15]	validation_0-rmse:2450410.50000
[16]	validation_0-rmse:2444457.00000
[17]	validation_0-rmse:2444781.50000
[18]	validation_0-rmse:2448224.00000
[19]	validation_0-rmse:2449340.75000
[20]	validation_0-rmse:2457060.75000
[21]	validation_0-rmse:2460673.00000
[22]	validation_0-rmse:2461318.50000
[23]	validation_0-rmse:2466573.00000
[24]	validation_0-rmse:2476796.25000
[0]	validation_0-rmse:2274123.25000
[1]	validation_0-rmse:2255507.25000
[2]	validation_0-rmse:2295588.50000
[3]	validation_0-rmse:2313103.25000
[4]	validation_0-rmse:2362502.75000
[5]	validation_0-rmse:2405323.25000
[6]	validation_0-rmse:2457345.50000
[7]	validation_0-rmse:2481799.50000
[8]	validation_0-rmse:2493627.50000
[9]	validation_0-rmse:2508933.00000
[10]	validation_0-rmse:2543107.50000
[11]	validation_0-rmse:2558449.00000
[12]	validation_0-rmse:2556964.50000
[13]	validation_0-rmse:2560947.75000
[14]	validation_0-rmse:2555507.25000
[15]	validation_0-rmse:2554037.75000
[16]	validation_0-rmse:2558081.00000
[17]	validation_0-rmse:2566189.25000
[18]	validation_0-rmse:2576657.50000
[19]	validation_0-rmse:2587924.75000
[20]	validation_0-rmse:2588754.25000
[21]	validation_0-rmse:2596007.00000
[22]	validation_0-rmse:2598787.00000
[23]	validation_0-rmse:2602458.00000
[24]	validation_0-rmse:2611824.25000
[0]	validation_0-rmse:2320949.00000
[1]	validation_0-rmse:2215517.25000
[2]	validation_0-rmse:2109041.25000
[3]	validation_0-rmse:2123025.50000
[4]	validation_0-rmse:2181153.50000
[5]	validation_0-rmse:2266651.50000
[6]	validation_0-rmse:2374822.25000
[7]	validation_0-rmse:2483848.50000
[8]	validation_0-rmse:2593120.75000
[9]	validation_0-rmse:2687324.25000
[10]	validation_0-rmse:2758284.50000
[11]	validation_0-rmse:2827064.00000
[12]	validation_0-rmse:2831244.00000
[13]	validation_0-rmse:2848447.75000
[14]	validation_0-rmse:2862597.50000
[15]	validation_0-rmse:2872344.75000
[16]	validation_0-rmse:2873912.00000
[17]	validation_0-rmse:2876304.75000
[18]	validation_0-rmse:2880345.25000
[19]	validation_0-rmse:2879368.00000
[20]	validation_0-rmse:2884489.25000
[21]	validation_0-rmse:2883981.75000
[22]	validation_0-rmse:2889667.00000
[23]	validation_0-rmse:2889104.75000
[24]	validation_0-rmse:2893059.25000
[0]	validation_0-rmse:2279906.50000
[1]	validation_0-rmse:2154632.00000
[2]	validation_0-rmse:2110526.00000
[3]	validation_0-rmse:2125875.00000
[4]	validation_0-rmse:2108316.75000
[5]	validation_0-rmse:2146384.50000
[6]	validation_0-rmse:2145842.50000
[7]	validation_0-rmse:2150690.50000
[8]	validation_0-rmse:2163318.00000
[9]	validation_0-rmse:2155963.00000
[10]	validation_0-rmse:2163567.75000
[11]	validation_0-rmse:2165900.50000
[12]	validation_0-rmse:2169204.25000
[13]	validation_0-rmse:2172157.25000
[14]	validation_0-rmse:2173509.50000
[15]	validation_0-rmse:2177462.50000
[16]	validation_0-rmse:2175985.25000
[17]	validation_0-rmse:2181348.25000
[18]	validation_0-rmse:2179484.25000
[19]	validation_0-rmse:2179490.25000
[20]	validation_0-rmse:2180285.50000
[21]	validation_0-rmse:2187015.25000
[22]	validation_0-rmse:2184962.00000
[23]	validation_0-rmse:2187936.75000
[24]	validation_0-rmse:2186721.00000
[0]	validation_0-rmse:2315426.25000
[1]	validation_0-rmse:2248505.75000
[2]	validation_0-rmse:2279921.50000
[3]	validation_0-rmse:2372335.50000
[4]	validation_0-rmse:2366589.00000
[5]	validation_0-rmse:2375266.50000
[6]	validation_0-rmse:2370033.00000
[7]	validation_0-rmse:2369106.25000
[8]	validation_0-rmse:2368687.50000
[9]	validation_0-rmse:2374956.00000
[10]	validation_0-rmse:2390477.00000
[11]	validation_0-rmse:2403109.75000
[12]	validation_0-rmse:2405052.25000
[13]	validation_0-rmse:2408894.75000
[14]	validation_0-rmse:2410240.75000
[15]	validation_0-rmse:2409081.75000
[16]	validation_0-rmse:2419053.25000
[17]	validation_0-rmse:2419678.25000
[18]	validation_0-rmse:2428875.00000
[19]	validation_0-rmse:2421941.25000
[20]	validation_0-rmse:2423234.25000
[21]	validation_0-rmse:2424940.00000
[22]	validation_0-rmse:2426370.75000
[23]	validation_0-rmse:2432291.00000
[24]	validation_0-rmse:2430858.00000
[0]	validation_0-rmse:2297907.00000
[1]	validation_0-rmse:2273955.25000
[2]	validation_0-rmse:2397557.50000
[3]	validation_0-rmse:2456421.25000
[4]	validation_0-rmse:2566401.75000
[5]	validation_0-rmse:2623710.75000
[6]	validation_0-rmse:2678972.00000
[7]	validation_0-rmse:2744704.75000
[8]	validation_0-rmse:2783356.25000
[9]	validation_0-rmse:2822401.25000
[10]	validation_0-rmse:2828917.25000
[11]	validation_0-rmse:2831376.25000
[12]	validation_0-rmse:2834156.50000
[13]	validation_0-rmse:2847794.75000
[14]	validation_0-rmse:2860971.50000
[15]	validation_0-rmse:2872023.25000
[16]	validation_0-rmse:2876773.00000
[17]	validation_0-rmse:2887399.75000
[18]	validation_0-rmse:2887948.50000
[19]	validation_0-rmse:2897918.50000
[20]	validation_0-rmse:2904793.75000
[21]	validation_0-rmse:2910649.50000
[22]	validation_0-rmse:2916909.25000
[23]	validation_0-rmse:2915264.00000
[24]	validation_0-rmse:2922395.75000
[0]	validation_0-rmse:2259056.75000
[1]	validation_0-rmse:2071656.12500
[2]	validation_0-rmse:1954938.25000
[3]	validation_0-rmse:1907633.50000
[4]	validation_0-rmse:1879737.25000
[5]	validation_0-rmse:1864304.75000
[6]	validation_0-rmse:1872892.00000
[7]	validation_0-rmse:1872838.62500
[8]	validation_0-rmse:1873251.37500
[9]	validation_0-rmse:1878864.12500
[10]	validation_0-rmse:1887877.37500
[11]	validation_0-rmse:1922763.50000
[12]	validation_0-rmse:1926339.75000
[13]	validation_0-rmse:1929673.37500
[14]	validation_0-rmse:1935202.00000
[15]	validation_0-rmse:1937761.00000
[16]	validation_0-rmse:1944345.50000
[17]	validation_0-rmse:1949454.50000
[18]	validation_0-rmse:1949627.00000
[19]	validation_0-rmse:1954825.25000
[20]	validation_0-rmse:1958095.00000
[21]	validation_0-rmse:1967122.62500
[22]	validation_0-rmse:1993231.00000
[23]	validation_0-rmse:1992221.75000
[24]	validation_0-rmse:1995566.75000
[0]	validation_0-rmse:2250726.25000
[1]	validation_0-rmse:2048716.62500
[2]	validation_0-rmse:1934108.25000
[3]	validation_0-rmse:1873591.25000
[4]	validation_0-rmse:1843736.37500
[5]	validation_0-rmse:1832034.25000
[6]	validation_0-rmse:1842726.75000
[7]	validation_0-rmse:1836495.87500
[8]	validation_0-rmse:1850359.37500
[9]	validation_0-rmse:1861305.75000
[10]	validation_0-rmse:1870196.12500
[11]	validation_0-rmse:1871945.75000
[12]	validation_0-rmse:1869448.12500
[13]	validation_0-rmse:1875138.00000
[14]	validation_0-rmse:1885242.62500
[15]	validation_0-rmse:1912875.12500
[16]	validation_0-rmse:1913781.25000
[17]	validation_0-rmse:1923086.37500
[18]	validation_0-rmse:1931954.12500
[19]	validation_0-rmse:1935478.62500
[20]	validation_0-rmse:1957742.12500
[21]	validation_0-rmse:1966063.62500
[22]	validation_0-rmse:1968358.62500
[23]	validation_0-rmse:1976923.87500
[24]	validation_0-rmse:1981808.37500
[0]	validation_0-rmse:2252256.75000
[1]	validation_0-rmse:2070402.75000
[2]	validation_0-rmse:1970334.75000
[3]	validation_0-rmse:1931466.87500
[4]	validation_0-rmse:1903113.12500
[5]	validation_0-rmse:1893674.75000
[6]	validation_0-rmse:1901122.62500
[7]	validation_0-rmse:1902212.00000
[8]	validation_0-rmse:1898097.75000
[9]	validation_0-rmse:1910776.50000
[10]	validation_0-rmse:1931021.12500
[11]	validation_0-rmse:1955868.50000
[12]	validation_0-rmse:1977627.75000
[13]	validation_0-rmse:1981806.62500
[14]	validation_0-rmse:1985582.37500
[15]	validation_0-rmse:2003338.12500
[16]	validation_0-rmse:2032094.50000
[17]	validation_0-rmse:2045178.12500
[18]	validation_0-rmse:2050162.50000
[19]	validation_0-rmse:2051220.62500
[20]	validation_0-rmse:2045657.62500
[21]	validation_0-rmse:2054810.50000
[22]	validation_0-rmse:2094846.37500
[23]	validation_0-rmse:2097562.75000
[24]	validation_0-rmse:2096306.00000
[0]	validation_0-rmse:2256284.25000
[1]	validation_0-rmse:2125804.75000
[2]	validation_0-rmse:2073804.50000
[3]	validation_0-rmse:2038440.12500
[4]	validation_0-rmse:2015392.50000
[5]	validation_0-rmse:1988518.87500
[6]	validation_0-rmse:2005024.25000
[7]	validation_0-rmse:2005450.00000
[8]	validation_0-rmse:2016660.00000
[9]	validation_0-rmse:2025875.37500
[10]	validation_0-rmse:2031463.37500
[11]	validation_0-rmse:2061639.25000
[12]	validation_0-rmse:2077956.37500
[13]	validation_0-rmse:2114299.50000
[14]	validation_0-rmse:2111098.00000
[15]	validation_0-rmse:2129834.75000
[16]	validation_0-rmse:2131761.75000
[17]	validation_0-rmse:2128517.00000
[18]	validation_0-rmse:2133366.50000
[19]	validation_0-rmse:2150815.50000
[20]	validation_0-rmse:2185128.25000
[21]	validation_0-rmse:2186118.75000
[22]	validation_0-rmse:2191050.25000
[23]	validation_0-rmse:2192831.75000
[24]	validation_0-rmse:2206435.00000
[0]	validation_0-rmse:2321846.00000
[1]	validation_0-rmse:2231651.25000
[2]	validation_0-rmse:2123390.25000
[3]	validation_0-rmse:2121298.75000
[4]	validation_0-rmse:2184836.00000
[5]	validation_0-rmse:2270543.75000
[6]	validation_0-rmse:2377485.00000
[7]	validation_0-rmse:2481762.00000
[8]	validation_0-rmse:2581135.50000
[9]	validation_0-rmse:2677473.00000
[10]	validation_0-rmse:2754256.00000
[11]	validation_0-rmse:2820333.50000
[12]	validation_0-rmse:2884447.75000
[13]	validation_0-rmse:2889652.00000
[14]	validation_0-rmse:2892063.25000
[15]	validation_0-rmse:2893510.50000
[16]	validation_0-rmse:2897382.50000
[17]	validation_0-rmse:2899057.00000
[18]	validation_0-rmse:2903129.75000
[19]	validation_0-rmse:2905050.50000
[20]	validation_0-rmse:2907667.25000
[21]	validation_0-rmse:2908744.25000
[22]	validation_0-rmse:2909458.25000
[23]	validation_0-rmse:2907976.00000
[24]	validation_0-rmse:2908273.00000
[0]	validation_0-rmse:2293300.50000
[1]	validation_0-rmse:2171406.25000
[2]	validation_0-rmse:2136467.00000
[3]	validation_0-rmse:2159793.00000
[4]	validation_0-rmse:2197408.25000
[5]	validation_0-rmse:2202485.25000
[6]	validation_0-rmse:2253433.25000
[7]	validation_0-rmse:2303828.75000
[8]	validation_0-rmse:2305695.75000
[9]	validation_0-rmse:2299744.00000
[10]	validation_0-rmse:2308518.00000
[11]	validation_0-rmse:2319122.25000
[12]	validation_0-rmse:2319760.25000
[13]	validation_0-rmse:2326093.00000
[14]	validation_0-rmse:2328017.75000
[15]	validation_0-rmse:2333495.00000
[16]	validation_0-rmse:2330931.50000
[17]	validation_0-rmse:2327340.50000
[18]	validation_0-rmse:2327786.50000
[19]	validation_0-rmse:2326863.00000
[20]	validation_0-rmse:2326576.75000
[21]	validation_0-rmse:2327533.00000
[22]	validation_0-rmse:2330282.25000
[23]	validation_0-rmse:2329596.50000
[24]	validation_0-rmse:2327059.25000
[0]	validation_0-rmse:2329629.25000
[1]	validation_0-rmse:2271377.75000
[2]	validation_0-rmse:2326298.00000
[3]	validation_0-rmse:2415294.50000
[4]	validation_0-rmse:2465712.50000
[5]	validation_0-rmse:2500940.25000
[6]	validation_0-rmse:2570803.75000
[7]	validation_0-rmse:2622096.50000
[8]	validation_0-rmse:2629403.50000
[9]	validation_0-rmse:2631020.25000
[10]	validation_0-rmse:2638092.50000
[11]	validation_0-rmse:2639299.00000
[12]	validation_0-rmse:2647815.00000
[13]	validation_0-rmse:2651448.50000
[14]	validation_0-rmse:2645108.00000
[15]	validation_0-rmse:2647389.25000
[16]	validation_0-rmse:2658640.25000
[17]	validation_0-rmse:2657580.75000
[18]	validation_0-rmse:2657435.50000
[19]	validation_0-rmse:2662934.50000
[20]	validation_0-rmse:2663788.75000
[21]	validation_0-rmse:2663189.50000
[22]	validation_0-rmse:2665018.25000
[23]	validation_0-rmse:2667248.00000
[24]	validation_0-rmse:2673186.00000
[0]	validation_0-rmse:2304299.25000
[1]	validation_0-rmse:2315696.50000
[2]	validation_0-rmse:2447269.50000
[3]	validation_0-rmse:2612068.25000
[4]	validation_0-rmse:2678230.00000
[5]	validation_0-rmse:2701499.50000
[6]	validation_0-rmse:2762580.25000
[7]	validation_0-rmse:2790480.25000
[8]	validation_0-rmse:2840717.50000
[9]	validation_0-rmse:2869560.25000
[10]	validation_0-rmse:2881093.50000
[11]	validation_0-rmse:2900041.50000
[12]	validation_0-rmse:2900598.00000
[13]	validation_0-rmse:2913748.75000
[14]	validation_0-rmse:2927103.75000
[15]	validation_0-rmse:2928723.75000
[16]	validation_0-rmse:2931569.75000
[17]	validation_0-rmse:2933571.25000
[18]	validation_0-rmse:2938173.00000
[19]	validation_0-rmse:2943588.75000
[20]	validation_0-rmse:2944118.00000
[21]	validation_0-rmse:2952033.00000
[22]	validation_0-rmse:2952912.25000
[23]	validation_0-rmse:2952384.50000
[24]	validation_0-rmse:2953697.00000
[0]	validation_0-rmse:2202666.75000
[1]	validation_0-rmse:2023141.62500
[2]	validation_0-rmse:1896711.12500
[3]	validation_0-rmse:1829293.25000
[4]	validation_0-rmse:1800703.87500
[5]	validation_0-rmse:1783431.25000
[6]	validation_0-rmse:1769310.75000
[7]	validation_0-rmse:1765571.25000
[8]	validation_0-rmse:1763530.12500
[9]	validation_0-rmse:1765678.50000
[10]	validation_0-rmse:1766481.37500
[11]	validation_0-rmse:1769147.50000
[12]	validation_0-rmse:1767120.75000
[13]	validation_0-rmse:1767803.25000
[14]	validation_0-rmse:1769508.37500
[15]	validation_0-rmse:1771902.87500
[16]	validation_0-rmse:1782947.37500
[17]	validation_0-rmse:1785498.00000
[18]	validation_0-rmse:1788225.37500
[19]	validation_0-rmse:1801357.87500
[20]	validation_0-rmse:1812773.37500
[21]	validation_0-rmse:1810444.25000
[22]	validation_0-rmse:1803368.25000
[23]	validation_0-rmse:1802699.75000
[24]	validation_0-rmse:1817435.00000
[0]	validation_0-rmse:2223077.50000
[1]	validation_0-rmse:2023684.25000
[2]	validation_0-rmse:1897926.12500
[3]	validation_0-rmse:1844971.00000
[4]	validation_0-rmse:1790687.00000
[5]	validation_0-rmse:1776909.87500
[6]	validation_0-rmse:1765683.00000
[7]	validation_0-rmse:1753476.87500
[8]	validation_0-rmse:1755103.37500
[9]	validation_0-rmse:1752100.25000
[10]	validation_0-rmse:1756088.37500
[11]	validation_0-rmse:1747712.50000
[12]	validation_0-rmse:1743484.37500
[13]	validation_0-rmse:1743375.12500
[14]	validation_0-rmse:1748432.37500
[15]	validation_0-rmse:1749440.37500
[16]	validation_0-rmse:1749123.37500
[17]	validation_0-rmse:1753837.25000
[18]	validation_0-rmse:1753270.00000
[19]	validation_0-rmse:1761412.50000
[20]	validation_0-rmse:1770647.87500
[21]	validation_0-rmse:1776711.87500
[22]	validation_0-rmse:1782102.62500
[23]	validation_0-rmse:1783755.00000
[24]	validation_0-rmse:1785251.87500
[0]	validation_0-rmse:2216312.25000
[1]	validation_0-rmse:2001798.12500
[2]	validation_0-rmse:1891087.12500
[3]	validation_0-rmse:1824857.75000
[4]	validation_0-rmse:1797998.62500
[5]	validation_0-rmse:1786637.50000
[6]	validation_0-rmse:1765043.00000
[7]	validation_0-rmse:1754946.12500
[8]	validation_0-rmse:1757539.87500
[9]	validation_0-rmse:1756096.87500
[10]	validation_0-rmse:1757276.37500
[11]	validation_0-rmse:1756988.12500
[12]	validation_0-rmse:1761331.75000
[13]	validation_0-rmse:1758049.87500
[14]	validation_0-rmse:1766931.37500
[15]	validation_0-rmse:1766794.62500
[16]	validation_0-rmse:1778482.12500
[17]	validation_0-rmse:1782544.37500
[18]	validation_0-rmse:1787579.37500
[19]	validation_0-rmse:1793817.50000
[20]	validation_0-rmse:1807053.50000
[21]	validation_0-rmse:1805497.12500
[22]	validation_0-rmse:1812873.12500
[23]	validation_0-rmse:1825330.50000
[24]	validation_0-rmse:1820360.12500
[0]	validation_0-rmse:2241953.25000
[1]	validation_0-rmse:2051895.00000
[2]	validation_0-rmse:1941130.87500
[3]	validation_0-rmse:1878193.00000
[4]	validation_0-rmse:1838474.12500
[5]	validation_0-rmse:1816640.00000
[6]	validation_0-rmse:1806001.50000
[7]	validation_0-rmse:1799820.37500
[8]	validation_0-rmse:1796824.12500
[9]	validation_0-rmse:1793777.12500
[10]	validation_0-rmse:1798654.50000
[11]	validation_0-rmse:1804621.50000
[12]	validation_0-rmse:1803866.37500
[13]	validation_0-rmse:1806077.12500
[14]	validation_0-rmse:1811613.12500
[15]	validation_0-rmse:1819661.37500
[16]	validation_0-rmse:1822850.75000
[17]	validation_0-rmse:1820656.12500
[18]	validation_0-rmse:1823793.62500
[19]	validation_0-rmse:1828113.00000
[20]	validation_0-rmse:1829992.87500
[21]	validation_0-rmse:1829959.75000
[22]	validation_0-rmse:1831722.25000
[23]	validation_0-rmse:1835950.37500
[24]	validation_0-rmse:1836365.75000
[0]	validation_0-rmse:2223738.00000
[1]	validation_0-rmse:2029287.37500
[2]	validation_0-rmse:1915409.62500
[3]	validation_0-rmse:1846056.50000
[4]	validation_0-rmse:1812634.75000
[5]	validation_0-rmse:1781107.75000
[6]	validation_0-rmse:1766052.75000
[7]	validation_0-rmse:1764338.37500
[8]	validation_0-rmse:1760356.75000
[9]	validation_0-rmse:1757428.37500
[10]	validation_0-rmse:1758482.37500
[11]	validation_0-rmse:1756544.37500
[12]	validation_0-rmse:1751739.62500
[13]	validation_0-rmse:1750661.87500
[14]	validation_0-rmse:1758464.37500
[15]	validation_0-rmse:1764680.00000
[16]	validation_0-rmse:1770737.87500
[17]	validation_0-rmse:1773389.50000
[18]	validation_0-rmse:1781289.25000
[19]	validation_0-rmse:1780631.50000
[20]	validation_0-rmse:1783814.87500
[21]	validation_0-rmse:1782903.50000
[22]	validation_0-rmse:1781429.25000
[23]	validation_0-rmse:1784587.50000
[24]	validation_0-rmse:1786058.50000
Best parameters found:  {'n_estimators': 25, 'max_depth': 2}
Lowest RMSE found:  3068223.9977987786
[Parallel(n_jobs=1)]: Done  20 out of  20 | elapsed:    3.0s finished
In [99]:
y_pred = randomized_mse.predict(X_test2)


# Comparing with original MSE, MAE and RMSE
print('----- XGBoost: Comparing with original MSE, MAE and RMSE -----')
print('---------------------------------------------------------------')

# MSE Computation
xgb_MSE_tuned = mean_squared_error(y_test2, y_pred)
print('The MSE of tuned XGBoost is: ', xgb_MSE_tuned)
print('MSE decreased: ', (1-xgb_MSE_tuned/xgb_MSE)*100, ' %')
print('---------------------------------------------------------------')

# MAE Computation
xgb_MAE_tuned = mean_absolute_error(y_test2, y_pred)
print('The MAE of the tuned XGBoost model is: ', xgb_MAE_tuned)
print('MAE decreased: ', (1-xgb_MAE_tuned/xgb_MAE)*100, ' %')
print('---------------------------------------------------------------')

# RMSE Computation
print('The RMSE of prediction for tuned XGBoost is:', round(mean_squared_log_error(y_test2, y_pred) ** 0.5, 5))
----- XGBoost: Comparing with original MSE, MAE and RMSE -----
---------------------------------------------------------------
The MSE of tuned XGBoost is:  3190005006889.0596
MSE decreased:  9.865272342689945  %
---------------------------------------------------------------
The MAE of the tuned XGBoost model is:  886358.4012362637
MAE decreased:  3.646149636637752  %
---------------------------------------------------------------
The RMSE of prediction for tuned XGBoost is: 0.57207

Results - LightGBM vs XGBoost

In [100]:
final_models = ['LightGBM',
          'XGBoost']

final_MSEs = [TUNED_lightgbm_MSE, xgb_MSE_tuned]

dff = {'Final_Model' : final_models, 'Final_MSE': final_MSEs}
MSE_vis = pd.DataFrame(dff)

fig_dims = (10, 5)
fig, ax = plt.subplots(figsize=fig_dims)
ax = sns.barplot(x="Final_Model", y="Final_MSE", data=dff, ax = ax)
ax.set_title("MSE of final models", pad=10, fontdict={'fontsize': 20})
ax.set_xlabel("Regression models",fontsize=20)
ax.set_xticklabels(ax.get_xticklabels(), rotation=30)
save_fig("MSE_finalmodels")

plt.show()
Saving figure MSE_finalmodels

Time based cross validation (time series split)

Time based cross validation performed on the best performing model : LightGBM

In [101]:
import pandas as pd
import datetime
from datetime import datetime as dt
from dateutil.relativedelta import *

class TimeBasedCV(object):
    '''
    Parameters 
    ----------
    train_period: int
        number of time units to include in each train set
        default is 30
    test_period: int
        number of time units to include in each test set
        default is 7
    freq: string
        frequency of input parameters. possible values are: days, months, years, weeks, hours, minutes, seconds
        possible values designed to be used by dateutil.relativedelta class
        deafault is days
    '''
    
    
    def __init__(self, train_period=30, test_period=7, freq='days'):
        self.train_period = train_period
        self.test_period = test_period
        self.freq = freq

        
        
    def split(self, data, validation_split_date=None, date_column='year', gap=0):
        '''
        Generate indices to split data into training and test set
        
        Parameters 
        ----------
        data: pandas DataFrame
            your data, contain one column for the record date 
        validation_split_date: datetime.date()
            first date to perform the splitting on.
            if not provided will set to be the minimum date in the data after the first training set
        date_column: string, deafult='record_date'
            date of each record
        gap: int, default=0
            for cases the test set does not come right after the train set,
            *gap* days are left between train and test sets
        
        Returns 
        -------
        train_index ,test_index: 
            list of tuples (train index, test index) similar to sklearn model selection
        '''
        
        # check that date_column exist in the data:
        try:
            data[date_column]
        except:
            raise KeyError(date_column)
                    
        train_indices_list = []
        test_indices_list = []

        if validation_split_date==None:
            validation_split_date = data[date_column].min().date() + eval('relativedelta('+self.freq+'=self.train_period)')
        
        start_train = validation_split_date - eval('relativedelta('+self.freq+'=self.train_period)')
        end_train = start_train + eval('relativedelta('+self.freq+'=self.train_period)')
        start_test = end_train + eval('relativedelta('+self.freq+'=gap)')
        end_test = start_test + eval('relativedelta('+self.freq+'=self.test_period)')

        while end_test < data[date_column].max().date():
            # train indices:
            cur_train_indices = list(data[(data[date_column].dt.date>=start_train) & 
                                     (data[date_column].dt.date<end_train)].index)

            # test indices:
            cur_test_indices = list(data[(data[date_column].dt.date>=start_test) &
                                    (data[date_column].dt.date<end_test)].index)
            
            print("Train period:",start_train,"-" , end_train, ", Test period", start_test, "-", end_test,
                  "# train records", len(cur_train_indices), ", # test records", len(cur_test_indices))

            train_indices_list.append(cur_train_indices)
            test_indices_list.append(cur_test_indices)

            # update dates:
            start_train = start_train + eval('relativedelta('+self.freq+'=self.test_period)')
            end_train = start_train + eval('relativedelta('+self.freq+'=self.train_period)')
            start_test = end_train + eval('relativedelta('+self.freq+'=gap)')
            end_test = start_test + eval('relativedelta('+self.freq+'=self.test_period)')

        # mimic sklearn output  
        index_output = [(train,test) for train,test in zip(train_indices_list,test_indices_list)]

        self.n_splits = len(index_output)
        
        return index_output
    
    
    def get_n_splits(self):
        """Returns the number of splitting iterations in the cross-validator
        Returns
        -------
        n_splits : int
            Returns the number of splitting iterations in the cross-validator.
        """
        return self.n_splits 
In [102]:
train_crossval
Out[102]:
num_speaker duration views year year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education event_category_TED1900s event_category_TED2000s event_category_TED@ event_category_TED@BCG event_category_TEDGlobal event_category_TEDMED event_category_TEDOther event_category_TEDSalon event_category_TEDWomen event_category_TEDx day_Friday day_Monday day_Saturday day_Sunday day_Thurday day_Tuesday day_Wednesday month_1 month_2 month_3 month_4 month_5 month_6 month_7 month_8 month_9 month_10 month_11 month_12 day_film_Friday day_film_Monday day_film_Saturday day_film_Sunday day_film_Thurday day_film_Tuesday day_film_Wednesday month_film_1 month_film_2 month_film_3 month_film_4 month_film_5 month_film_6 month_film_7 month_film_8 month_film_9 month_film_10 month_film_11 month_film_12
0 1 19.40 47227110 2006 2006 27 149 3 1 0 1 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
1 1 16.28 3200520 2006 2006 27 233 4 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
2 1 21.43 1636292 2006 2006 16 202 3 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0
3 1 18.60 1697550 2006 2006 19 213 2 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
4 1 19.83 12005869 2006 2006 31 172 9 1 0 0 1 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1998 1 8.82 1453242 2015 2014 54 457 2 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0
1999 2 15.73 2269844 2015 2015 53 419 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0
2000 1 19.90 1117165 2015 2015 39 542 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2001 1 9.47 1254964 2015 2015 59 547 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
2002 1 12.77 16601927 2015 2015 67 476 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0

2003 rows × 66 columns

In [103]:
#saving training data into CSV to parse with dates for cross validation 
train_crossval.to_csv('train_data_ted_crossval.csv')
In [104]:
# How to use TimeBasedCV
data_for_modeling=pd.read_csv('train_data_ted_crossval.csv', parse_dates=['year'])
data_for_modeling.head()
Out[104]:
Unnamed: 0 num_speaker duration views year year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education event_category_TED1900s event_category_TED2000s event_category_TED@ event_category_TED@BCG event_category_TEDGlobal event_category_TEDMED event_category_TEDOther event_category_TEDSalon event_category_TEDWomen event_category_TEDx day_Friday day_Monday day_Saturday day_Sunday day_Thurday day_Tuesday day_Wednesday month_1 month_2 month_3 month_4 month_5 month_6 month_7 month_8 month_9 month_10 month_11 month_12 day_film_Friday day_film_Monday day_film_Saturday day_film_Sunday day_film_Thurday day_film_Tuesday day_film_Wednesday month_film_1 month_film_2 month_film_3 month_film_4 month_film_5 month_film_6 month_film_7 month_film_8 month_film_9 month_film_10 month_film_11 month_film_12
0 0 1 19.40 47227110 2006-01-01 2006 27 149 3 1 0 1 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
1 1 1 16.28 3200520 2006-01-01 2006 27 233 4 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
2 2 1 21.43 1636292 2006-01-01 2006 16 202 3 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0
3 3 1 18.60 1697550 2006-01-01 2006 19 213 2 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
4 4 1 19.83 12005869 2006-01-01 2006 31 172 9 1 0 0 1 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0
In [105]:
tscv = TimeBasedCV(train_period=10, #number of time units to include in each train set
                   test_period=2, #number of time units to include in each test set
                   freq='years') #frequency of input parameters 

tscv
Out[105]:
<__main__.TimeBasedCV at 0x1f20f32a6c8>
In [106]:
for train_index, test_index in tscv.split(data_for_modeling,validation_split_date=datetime.date(2007,1,1),date_column='year'):
    print(train_index)
    print(test_index)
Train period: 1997-01-01 - 2007-01-01 , Test period 2007-01-01 - 2009-01-01 # train records 50 , # test records 283
Train period: 1999-01-01 - 2009-01-01 , Test period 2009-01-01 - 2011-01-01 # train records 333 , # test records 421
Train period: 2001-01-01 - 2011-01-01 , Test period 2011-01-01 - 2013-01-01 # train records 754 , # test records 551
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332]
[333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753]
[754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127, 1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138, 1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149, 1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160, 1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171, 1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182, 1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193, 1194, 1195, 1196, 1197, 1198, 1199, 1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212, 1213, 1214, 1215, 1216, 1217, 1218, 1219, 1220, 1221, 1222, 1223, 1224, 1225, 1226, 1227, 1228, 1229, 1230, 1231, 1232, 1233, 1234, 1235, 1236, 1237, 1238, 1239, 1240, 1241, 1242, 1243, 1244, 1245, 1246, 1247, 1248, 1249, 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258, 1259, 1260, 1261, 1262, 1263, 1264, 1265, 1266, 1267, 1268, 1269, 1270, 1271, 1272, 1273, 1274, 1275, 1276, 1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284, 1285, 1286, 1287, 1288, 1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299, 1300, 1301, 1302, 1303, 1304]
In [107]:
# get number of splits
tscv.get_n_splits()
Out[107]:
3
In [108]:
chosen_features
Out[108]:
['day_film_Sunday',
 'day_film_Saturday',
 'Communication',
 'day_film_Thurday',
 'day_film_Tuesday',
 'day_film_Wednesday',
 'month_film_2',
 'month_film_5',
 'month_film_6',
 'month_film_7',
 'month_film_9',
 'month_film_10',
 'num_speaker',
 'duration',
 'year',
 'year_film',
 'title_len',
 'description_len',
 'speaker_frequency',
 'repeat_speaker',
 'Technology/Science',
 'Humanity',
 'Global Issues',
 'Art/Creativity',
 'Business',
 'Entertainment',
 'day_film_Monday',
 'day_film_Friday',
 'Education',
 'month_11',
 'day_Sunday',
 'day_Thurday',
 'event_category_TEDx',
 'month_12',
 'day_Tuesday']
In [109]:
if "year" not in chosen_features: 
    chosen_features += ["year"]
In [110]:
data_for_modeling
Out[110]:
Unnamed: 0 num_speaker duration views year year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment Health Communication Education event_category_TED1900s event_category_TED2000s event_category_TED@ event_category_TED@BCG event_category_TEDGlobal event_category_TEDMED event_category_TEDOther event_category_TEDSalon event_category_TEDWomen event_category_TEDx day_Friday day_Monday day_Saturday day_Sunday day_Thurday day_Tuesday day_Wednesday month_1 month_2 month_3 month_4 month_5 month_6 month_7 month_8 month_9 month_10 month_11 month_12 day_film_Friday day_film_Monday day_film_Saturday day_film_Sunday day_film_Thurday day_film_Tuesday day_film_Wednesday month_film_1 month_film_2 month_film_3 month_film_4 month_film_5 month_film_6 month_film_7 month_film_8 month_film_9 month_film_10 month_film_11 month_film_12
0 0 1 19.40 47227110 2006-01-01 2006 27 149 3 1 0 1 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
1 1 1 16.28 3200520 2006-01-01 2006 27 233 4 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
2 2 1 21.43 1636292 2006-01-01 2006 16 202 3 1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0
3 3 1 18.60 1697550 2006-01-01 2006 19 213 2 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
4 4 1 19.83 12005869 2006-01-01 2006 31 172 9 1 0 0 1 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1998 1998 1 8.82 1453242 2015-01-01 2014 54 457 2 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0
1999 1999 2 15.73 2269844 2015-01-01 2015 53 419 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0
2000 2000 1 19.90 1117165 2015-01-01 2015 39 542 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2001 2001 1 9.47 1254964 2015-01-01 2015 59 547 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
2002 2002 1 12.77 16601927 2015-01-01 2015 67 476 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0

2003 rows × 67 columns

In [111]:
#### Example- compute average test sets score: ####
X = data_for_modeling[chosen_features]
y = data_for_modeling['views']
In [112]:
X.head()
Out[112]:
day_film_Sunday day_film_Saturday Communication day_film_Thurday day_film_Tuesday day_film_Wednesday month_film_2 month_film_5 month_film_6 month_film_7 month_film_9 month_film_10 num_speaker duration year year_film title_len description_len speaker_frequency repeat_speaker Technology/Science Humanity Global Issues Art/Creativity Business Entertainment day_film_Monday day_film_Friday Education month_11 day_Sunday day_Thurday event_category_TEDx month_12 day_Tuesday
0 0 0 0 0 0 0 1 0 0 0 0 0 1 19.40 2006-01-01 2006 27 149 3 1 0 1 0 1 0 0 0 1 1 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 0 0 0 1 16.28 2006-01-01 2006 27 233 4 1 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0
2 0 0 0 1 0 0 1 0 0 0 0 0 1 21.43 2006-01-01 2006 16 202 3 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0
3 0 1 0 0 0 0 1 0 0 0 0 0 1 18.60 2006-01-01 2006 19 213 2 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 1 0 1 0 0 0 0 0 1 19.83 2006-01-01 2006 31 172 9 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1
In [113]:
from sklearn.linear_model import LinearRegression
import numpy as np

scores = []
for train_index, test_index in tscv.split(X, validation_split_date=datetime.date(2007,1,1)):

    data_train   = X.loc[train_index].drop('year', axis=1)
    target_train = y.loc[train_index]

    data_test    = X.loc[test_index].drop('year', axis=1)
    target_test  = y.loc[test_index]
   
    clf = LinearRegression()
    clf.fit(data_train,target_train)

    preds = clf.predict(data_test)

    # accuracy for the current fold only    
    r2score = clf.score(data_test,target_test)

    scores.append(r2score)
Train period: 1997-01-01 - 2007-01-01 , Test period 2007-01-01 - 2009-01-01 # train records 50 , # test records 283
Train period: 1999-01-01 - 2009-01-01 , Test period 2009-01-01 - 2011-01-01 # train records 333 , # test records 421
Train period: 2001-01-01 - 2011-01-01 , Test period 2011-01-01 - 2013-01-01 # train records 754 , # test records 551
In [114]:
#### Example- RandomizedSearchCV ####
from sklearn.model_selection import RandomizedSearchCV
from lightgbm import LGBMRegressor
from random import randint, uniform

tscv = TimeBasedCV(train_period=7, test_period=3,freq='years')
index_output = tscv.split(data_for_modeling, validation_split_date=datetime.date(2007,1,1))

lgbm = gbm_TUNED

lgbmPd = {" max_depth": [-1,2]
         }

model = RandomizedSearchCV(
    estimator = lgbm,
    param_distributions = lgbmPd,
    n_iter = 10,
    n_jobs = -1,
    iid = True,
    cv = index_output,
    verbose=5,
    pre_dispatch='2*n_jobs',
    random_state = None,
    return_train_score = True)

model.fit(X.drop('year', axis=1),y)
model.cv_results_
Train period: 2000-01-01 - 2007-01-01 , Test period 2007-01-01 - 2010-01-01 # train records 50 , # test records 488
Train period: 2003-01-01 - 2010-01-01 , Test period 2010-01-01 - 2013-01-01 # train records 538 , # test records 767
Fitting 2 folds for each of 2 candidates, totalling 4 fits
C:\Users\Sophie\anaconda3\lib\site-packages\sklearn\model_selection\_search.py:282: UserWarning:

The total space of parameters 2 is smaller than n_iter=10. Running 2 iterations. For exhaustive searches, use GridSearchCV.

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done   2 out of   4 | elapsed:    0.1s remaining:    0.1s
[Parallel(n_jobs=-1)]: Done   4 out of   4 | elapsed:    0.2s remaining:    0.0s
[Parallel(n_jobs=-1)]: Done   4 out of   4 | elapsed:    0.2s finished
C:\Users\Sophie\anaconda3\lib\site-packages\sklearn\model_selection\_search.py:849: FutureWarning:

The parameter 'iid' is deprecated in 0.22 and will be removed in 0.24.

[LightGBM] [Warning] max_depth is set=2000, max_depth=-1 will be ignored. Current value: max_depth=2000
[LightGBM] [Warning] feature_fraction is set=0.8, colsample_bytree=1.0 will be ignored. Current value: feature_fraction=0.8
Out[114]:
{'mean_fit_time': array([0.04460883, 0.04135752]),
 'std_fit_time': array([0.0186801 , 0.01894617]),
 'mean_score_time': array([0.00748396, 0.00548005]),
 'std_score_time': array([0.00050068, 0.00150132]),
 'param_ max_depth': masked_array(data=[-1, 2],
              mask=[False, False],
        fill_value='?',
             dtype=object),
 'params': [{' max_depth': -1}, {' max_depth': 2}],
 'split0_test_score': array([-1.30618764, -1.30618764]),
 'split1_test_score': array([-0.00102877, -0.00102877]),
 'mean_test_score': array([-0.50853278, -0.50853278]),
 'std_test_score': array([0.6362492, 0.6362492]),
 'rank_test_score': array([1, 1]),
 'split0_train_score': array([0.01081634, 0.01081634]),
 'split1_train_score': array([0.02311387, 0.02311387]),
 'mean_train_score': array([0.01696511, 0.01696511]),
 'std_train_score': array([0.00614877, 0.00614877])}

Feature Importance

We will perform Feature importance on the LightGBM model.

In [117]:
# LightGBM for feature importance on a regression problem
from lightgbm import LGBMRegressor
from matplotlib import pyplot

# get importance
importance = gbm_TUNED.feature_importances_
# summarize feature importance
for i,v in enumerate(importance):
    print('Feature: %0d, Score: %.5f' % (i,v))
# plot feature importance
pyplot.bar([x for x in range(len(importance))], importance)
save_fig("feature_importances")

pyplot.show()

features = pd.DataFrame(list(zip(chosen_features,importance)), columns = ['predictor','importance'])
print(features)
Feature: 0, Score: 1.00000
Feature: 1, Score: 0.00000
Feature: 2, Score: 9.00000
Feature: 3, Score: 40.00000
Feature: 4, Score: 41.00000
Feature: 5, Score: 64.00000
Feature: 6, Score: 134.00000
Feature: 7, Score: 11.00000
Feature: 8, Score: 6.00000
Feature: 9, Score: 54.00000
Feature: 10, Score: 0.00000
Feature: 11, Score: 5.00000
Feature: 12, Score: 0.00000
Feature: 13, Score: 1286.00000
Feature: 14, Score: 407.00000
Feature: 15, Score: 404.00000
Feature: 16, Score: 1019.00000
Feature: 17, Score: 1192.00000
Feature: 18, Score: 225.00000
Feature: 19, Score: 31.00000
Feature: 20, Score: 145.00000
Feature: 21, Score: 92.00000
Feature: 22, Score: 110.00000
Feature: 23, Score: 180.00000
Feature: 24, Score: 40.00000
Feature: 25, Score: 32.00000
Feature: 26, Score: 25.00000
Feature: 27, Score: 25.00000
Feature: 28, Score: 46.00000
Feature: 29, Score: 37.00000
Feature: 30, Score: 16.00000
Feature: 31, Score: 28.00000
Feature: 32, Score: 101.00000
Feature: 33, Score: 20.00000
Feature: 34, Score: 68.00000
Saving figure feature_importances
              predictor  importance
0       day_film_Sunday           1
1     day_film_Saturday           0
2         Communication           9
3      day_film_Thurday          40
4      day_film_Tuesday          41
5    day_film_Wednesday          64
6          month_film_2         134
7          month_film_5          11
8          month_film_6           6
9          month_film_7          54
10         month_film_9           0
11        month_film_10           5
12          num_speaker           0
13             duration        1286
14                 year         407
15            year_film         404
16            title_len        1019
17      description_len        1192
18    speaker_frequency         225
19       repeat_speaker          31
20   Technology/Science         145
21             Humanity          92
22        Global Issues         110
23       Art/Creativity         180
24             Business          40
25        Entertainment          32
26      day_film_Monday          25
27      day_film_Friday          25
28            Education          46
29             month_11          37
30           day_Sunday          16
31          day_Thurday          28
32  event_category_TEDx         101
33             month_12          20
34          day_Tuesday          68

Final Model Performance

Evaluating the performance of the tuned LightGBM model using our validation data, which was pre-processed using the pipeline (in previous steps in code above)

In [116]:
# X_test pre-processed through pipeline
# Make final predictions using the tuned LightGBM model 
final_predictions = gbm_TUNED.predict(X_test)
---------------------------------------------------------------
The RMSLE of prediction for tuned LightGBM is: 0.64377
In [118]:
## Evaluating the Metrics

# MSE Computation
valid_lightgbm_MSE = mean_squared_error(y_test, final_predictions)
print('The MSE of validation model: ', valid_lightgbm_MSE)
print('---------------------------------------------------------------')

# MAE Computation
valid_lightgbm_MAE = mean_absolute_error(y_test, final_predictions)
print('The MAE of validation model: ', valid_lightgbm_MAE)
print('---------------------------------------------------------------')

# RMSLE Computation
print('The RMSLE of validated model:', round(mean_squared_log_error(y_test, final_predictions) ** 0.5, 5))
The MSE of validation model:  2211810336746.4736
---------------------------------------------------------------
The MAE of validation model:  829503.1240797926
---------------------------------------------------------------
The RMSLE of validated model: 0.64377

Conclusion

While Gradient Boosting Regressor did not perform most optimally, our hypothesis was supported since tree-based methods, specifically LightGBM and XGBoost, outperformed the other models at prediction. However, this difference is quite marginal.

We cannot conclude that the current model is effective and performant at predicting the number of views of future Ted Talks, based on the metrics evaluated (MAE, MSE, RMSLE).

Contrary to what we expected, features such as duration, description length, and title length were most important for predicting the number of views, as opposed to features more directly related to the content of the talk itself, such as its topic category.